ABSTRACT


Monte Carlo simulation and a multivariate normal
approximation are used to find the operating characteristic (OC)
and average sample number (ASN) of the sequential analysis of
variance (SANOVA) test. It is shown that the SANOVA test is
remarkably robust.

SEQUENTIAL FIXED EFFECTS
ONE-WAY ANALYSIS
OF VARIANCE


1.0 INTRODUCTION

This  first  chapter  of  the  thesis  will  consider 
both  the  fixed  and  sequential  analysis  of  variance 
tests.  For  the  fixed  sample  test  the  discussions  consist 
of  the  statistical  model,  the  optimum  properties  of  the 
test,  and  the  operating  characteristic  (OC)  function. 
Each  of  these  concepts  is  important  for  the  consideration 
of  the  sequential  analysis  of  variance  test. 

The  sequential  analysis  of  variance  test  (termed 
SANOVA)  is  first  discussed  from  a historical  perspective. 
Further  discussions  consist  of  the  experimental  procedure, 
the test statistic decision rule [...] (ASN) [...] extremely [...]

1.1 ONE-WAY FIXED EFFECTS ANALYSIS OF VARIANCE


Analysis of variance, a term introduced into statistics
by R.A. Fisher (1918, 1925, 1935), is a statistical
technique for analyzing measurements that depend upon
several kinds of effects operating simultaneously. In
general, this technique consists of a body of tests of
hypotheses, methods of estimation, etc., using statistics
which are linear combinations of sums of squares of linear
functions of the observed measurements. The simplest case
in which analysis of variance is applied is the one-way
classification, in which the observations depend upon only
one factor.

In the one-way layout, a population is stratified into
$m$ subpopulations according to some characteristic or factor,
and $n_i$ independent observations are taken from each of $k$
of the $m$ subpopulations ($i = 1,\ldots,k$). Let the $j$th
observation from population $i$ be denoted by $x_{ij}$, where
$i = 1,\ldots,k$ and $j = 1,\ldots,n_i$. Given that population $i$
has mean $\mu + \delta_i$ and standard deviation $\sigma_i$, the statistical
model employed in the one-way layout is

$$x_{ij} = \mu + \delta_i + e_{ij}, \qquad i = 1,\ldots,k;\; j = 1,\ldots,n_i,$$

with the parameters $\delta_1,\ldots,\delta_k$ satisfying the following
condition

$$n_1\delta_1 + \cdots + n_k\delta_k = 0.$$

The parameter $\delta_i$ is referred to as the differential
effect due to the factor at level $i$.

The usual hypothesis of interest is whether
$\delta_1 = \delta_2 = \cdots = \delta_k = 0$, which is equivalent to the
hypothesis of the equality of the $k$ means. The analysis of
the effect of the factor depends upon whether $k < m$ or
$k = m$. Eisenhart (1947) was the first to differentiate
between the two situations. He used the term Model I, or
fixed effects model, for the case where the sample consists
of all groups in the population, i.e., $k = m$, and
Model II, or random effects model, for the case where the
interest is in the population from which the sample came,
i.e., $k < m$. This thesis will be concerned only with
fixed-effects one-way analysis of variance.

The analysis of variance technique requires several
assumptions. Specifically, it is assumed that the
observations from each of the subpopulations are random
variables distributed normally with mean $\mu + \delta_i$ and
standard deviation $\sigma = \sigma_i$ for all $i$. In other words, the
model may be expressed as

$$x_{ij} = \mu + \delta_i + e_{ij}, \qquad i = 1,\ldots,k;\; j = 1,\ldots,n_i,$$

$$x_{ij} \sim N(\mu + \delta_i, \sigma),$$

$$e_{ij} \sim N(0, \sigma),$$

and

$$\operatorname{cov}(x_{ij}, x_{lm}) = 0 \quad \text{for } (i,j) \neq (l,m).$$

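With this model the hypotheses

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{vs.} \quad H_1: \text{not all means equal}$$

can be tested with the following statistic

$$F_{cal} = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2/(k-1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2/(N-k)}$$

where

$$N = \sum_{i=1}^{k} n_i, \qquad \bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}.$$

This statistic can be shown (Kempthorne, 1952) to be
distributed as a noncentral F variate with $(k-1, N-k)$ degrees
of freedom and noncentrality parameter $n\lambda$, where

$$\lambda = \sum_{i=1}^{k} \frac{\delta_i^2 n_i}{n\sigma^2} = \sum_{i=1}^{k} \frac{(\mu_i - \bar{\mu})^2 n_i}{n\sigma^2}, \quad \text{with} \quad \bar{\mu} = \frac{1}{N}\sum_{i=1}^{k} n_i\mu_i \quad \text{and} \quad n = \frac{1}{k}\sum_{j=1}^{k} n_j.$$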
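The statistic $F_{cal}$ can be illustrated with a short computation. The following is a minimal sketch in Python, not the program used in this thesis; the function name `one_way_f` and the three small samples are hypothetical and serve only to show how the between-group and within-group sums of squares enter the ratio.

```python
def one_way_f(groups):
    """Return (F_cal, k - 1, N - k) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: sum of n_i * (group mean - grand mean)^2
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k)), k - 1, n_total - k


# Hypothetical data: three groups of three observations each.
f_cal, df1, df2 = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f_cal, df1, df2)
```

For these illustrative data the between-group sum of squares is 26 and the within-group sum of squares is 6, so the ratio is (26/2)/(6/6).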
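The density function of a noncentral F variate with $\nu_1, \nu_2$
degrees of freedom and noncentrality parameter $\lambda$ is given by

$$f(x) = \frac{e^{-\lambda/2}\,\nu_1^{\nu_1/2}\,\nu_2^{\nu_2/2}\,x^{\nu_1/2-1}}{B(\tfrac{1}{2}\nu_1, \tfrac{1}{2}\nu_2)\,(\nu_2 + \nu_1 x)^{(\nu_1+\nu_2)/2}} \sum_{j=0}^{\infty} \left[\frac{\tfrac{1}{2}\lambda\nu_1 x}{\nu_2 + \nu_1 x}\right]^{j} \frac{\Gamma(\tfrac{1}{2}\nu_1)\,\Gamma\{\tfrac{1}{2}(2j + \nu_1 + \nu_2)\}}{j!\,\Gamma\{\tfrac{1}{2}(\nu_1 + \nu_2)\}\,\Gamma\{\tfrac{1}{2}(2j + \nu_1)\}} \tag{1.1.1}$$

(Johnson and Kotz, 1970).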
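The series in (1.1.1) can be evaluated term by term, since each term is obtained from the previous one by a simple ratio. The sketch below is an illustration only and is not the program of Appendix A; the function name `noncentral_f_pdf` and the stopping tolerance are assumptions made here.

```python
import math


def noncentral_f_pdf(x, v1, v2, lam, tol=1e-14, max_terms=5000):
    """Density of a noncentral F variate, summed from the series (1.1.1)."""
    if x <= 0.0:
        return 0.0
    # Logarithm of the factor in front of the sum.
    log_beta = (math.lgamma(v1 / 2) + math.lgamma(v2 / 2)
                - math.lgamma((v1 + v2) / 2))
    log_front = (-lam / 2 + (v1 / 2) * math.log(v1) + (v2 / 2) * math.log(v2)
                 + (v1 / 2 - 1) * math.log(x) - log_beta
                 - ((v1 + v2) / 2) * math.log(v2 + v1 * x))
    z = 0.5 * lam * v1 * x / (v2 + v1 * x)
    a, b = (v1 + v2) / 2, v1 / 2
    term, total = 1.0, 1.0
    for j in range(max_terms):
        # Ratio of successive terms of the series in (1.1.1).
        term *= (a + j) / (b + j) * z / (j + 1)
        total += term
        if term < tol * total:
            break
    return math.exp(log_front) * total
```

When `lam` is zero only the first term of the sum survives, and the function reduces to the central F density, which gives a simple check on the leading factor.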

If the null hypothesis is true, the distribution of
$F_{cal}$ is a central F distribution with $k-1$, $N-k$ degrees of
freedom. Hence, if the hypothesis is rejected whenever
$F_{cal}$ is greater than the $100(1-\alpha)\%$ point of this
distribution, that is,

$$F_{cal} > F^{*}_{k-1,\,N-k,\,1-\alpha},$$

then the significance level of the test will be $\alpha$.

The operating characteristic curve of the test, that
is, the probability of accepting $H_0$, is given by
$\Pr\{F_{cal} \le F^{*}_{k-1,\,N-k,\,1-\alpha}\}$. Since
$F_{cal} \sim F_{k-1,\,N-k,\,\kappa}$, the
OC of the test is characterized by the parameter $\kappa = n\lambda$, i.e.

$$OC(\lambda) = \Pr\{F_{k-1,\,N-k,\,\kappa} \le F^{*}_{k-1,\,N-k,\,1-\alpha}\}.$$

Several sets of tables and curves have been prepared
from which the OC curve for selected tests can be obtained
(Tang 1938, Pearson and Hartley 1951, Lehmer 1944, Fox
1956, Fix 1949). Most of these tables are entered with a
parameter other than $\kappa$. Appendix A contains a
computer program which will calculate the OC curve (as a
function of $\lambda$) for any given test.
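The program of Appendix A is not reproduced here. As a rough illustration of the calculation it performs, the following sketch integrates the noncentral F density (1.1.1) numerically up to the critical value. All function names are hypothetical, the midpoint rule is a choice made here for simplicity, and equal group sizes $n$ are assumed so that $N = kn$. The example values ($k = 2$, $n = 10$, critical $F$ of 3.01) are those printed for Test #1 in Table 1.

```python
import math


def noncentral_f_pdf(x, v1, v2, kappa):
    """Noncentral F density (1.1.1) with noncentrality parameter kappa."""
    log_beta = (math.lgamma(v1 / 2) + math.lgamma(v2 / 2)
                - math.lgamma((v1 + v2) / 2))
    log_front = (-kappa / 2 + (v1 / 2) * math.log(v1) + (v2 / 2) * math.log(v2)
                 + (v1 / 2 - 1) * math.log(x) - log_beta
                 - ((v1 + v2) / 2) * math.log(v2 + v1 * x))
    z = 0.5 * kappa * v1 * x / (v2 + v1 * x)
    term, total, j = 1.0, 1.0, 0
    while term > 1e-16 * total:
        term *= ((v1 + v2) / 2 + j) / (v1 / 2 + j) * z / (j + 1)
        total += term
        j += 1
    return math.exp(log_front) * total


def fixed_sample_oc(lam, k, n, f_crit, steps=4000):
    """OC(lambda) = Pr{F <= f_crit} for k groups of n observations each."""
    v1, v2, kappa = k - 1, k * n - k, n * lam
    # Substituting x = s**2 removes the x**(-1/2) singularity that the
    # density has at the origin when v1 = 1.
    h = math.sqrt(f_crit) / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += 2.0 * s * noncentral_f_pdf(s * s, v1, v2, kappa)
    return total * h


oc_at_h0 = fixed_sample_oc(0.0, k=2, n=10, f_crit=3.01)
oc_at_h1 = fixed_sample_oc(1.0, k=2, n=10, f_crit=3.01)
```

The two values can be compared with the first and last entries of the OC function listed for Test #1 in Table 1.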

Originally ANOVA was derived from a distributional
point of view, but the F-test has been found to possess
several optimum properties. Hsu (1941) showed that the
F-test is UMP amongst all tests of size $\alpha$ whose power
depends only upon $\lambda$, and Wald (1942a) proved that the F-test is
best when one is interested uniformly in all alternatives,
as expressed by uniform weighting on spheres. As far as
ANOVA is concerned, it is immaterial whether the value of
$\lambda$ is built up by a number of small contributions or a
single large one. Situations where the main emphasis is
instead on detection of large deviations should not use
ANOVA, since the test is no longer optimum in these cases.

1.2  SEQUENTIAL  ONE-WAY  FIXED  EFFECTS  ANALYSIS  OF  VARIANCE 

Wald (1947) first presented, and systematically
studied, the sequential test of a simple hypothesis against
a simple alternative. Let $H_0$ denote the hypothesis that
the population density is $f_0(x)$, and $H_1$ the hypothesis that
it is $f_1(x)$. Constants $A$ and $B$ are chosen ($A > B$), and
after each observation in a sequence the corresponding
likelihood ratio is computed:

$$\Lambda_n = \frac{f_1(x_1)\,f_1(x_2)\cdots f_1(x_n)}{f_0(x_1)\,f_0(x_2)\cdots f_0(x_n)}.$$

The procedure is then as follows: reject $H_0$ if $\Lambda_n \ge A$,
accept $H_0$ if $\Lambda_n \le B$, and obtain another observation if
$B < \Lambda_n < A$. $A$ and $B$ are chosen so as to make the
probabilities of Type-I and Type-II errors equal to $\alpha$ and $\beta$
respectively.

Exact values of $A$ and $B$ are difficult to obtain.
However, Wald (1947) proved that for small values of $\alpha$ and $\beta$

$$A \approx \frac{1-\beta}{\alpha} \quad \text{and} \quad B \approx \frac{\beta}{1-\alpha}.$$
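The procedure can be made concrete for the simplest case of two normal densities with known, equal variance. The sketch below is illustrative only: the function name `sprt_normal_mean` and its arguments are hypothetical, and the boundaries are Wald's approximations rather than exact values. The likelihood ratio is accumulated on the logarithmic scale to avoid overflow.

```python
import math


def sprt_normal_mean(xs, mu0, mu1, sigma, alpha, beta):
    """Wald's SPRT of H0: mean = mu0 against H1: mean = mu1 (sigma known)."""
    log_a = math.log((1 - beta) / alpha)   # upper boundary, A ~ (1-beta)/alpha
    log_b = math.log(beta / (1 - alpha))   # lower boundary, B ~ beta/(1-alpha)
    log_ratio = 0.0
    for n, x in enumerate(xs, start=1):
        # log f1(x) - log f0(x) for two normal densities with equal variance
        log_ratio += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma ** 2
        if log_ratio >= log_a:
            return "reject H0", n
        if log_ratio <= log_b:
            return "accept H0", n
    return "continue", len(xs)


# Hypothetical observations, all equal to 2, tested with alpha = beta = 0.05.
decision, stage = sprt_normal_mean([2.0] * 5, 0.0, 1.0, 1.0, 0.05, 0.05)
```

If the supplied observations are exhausted before either boundary is crossed, the function reports that sampling should continue.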

Since the hypothesis about the equality of $k$ normal
population means with common unknown variance is a composite
multiparameter hypothesis with a nuisance parameter, Wald's
theory of the sequential probability ratio test cannot be
directly applied. To deal with problems such as these,
Wald introduced the method of weight functions which,
through the notion of a prior distribution for unknown
parameters, essentially reduced the basic problem to testing
hypotheses in one-parameter families. A difficulty with
this procedure is the choice of the weight function.

Cox (1952) devised a unified method under which sequential
tests can be obtained for composite hypotheses. The basic
idea behind Cox's procedure is to consider a sequence
formed by transforming the original observations, the
transformation chosen so that the new sequence depends
upon a single parameter. Although the distribution of the
transformed values $\{T_n\}$ depends upon only a single
parameter $\theta$, the sequence $\{T_n\}$ may not be independent. Cox
gave conditions under which the following factorization is
possible

$$f(T_1, T_2, \ldots, T_n \mid \theta) = f(T_n \mid \theta)\, g(T_1, \ldots, T_{n-1} \mid T_n)$$

where $g(T_1, \ldots, T_{n-1} \mid T_n)$ does not depend upon $\theta$.

When this factorization is possible a sequential test can be
developed to make a decision about this single parameter $\theta$, using
only the transformed values $\{T_n\}$. The test for
discriminating between the hypotheses

$$H_0: \theta = \theta_0 \quad \text{vs.} \quad H_1: \theta = \theta_1$$

can now be constructed by considering the following ratio

$$\Lambda_n = \frac{f(T_n \mid \theta_1)}{f(T_n \mid \theta_0)}.$$

Johnson (1953) applied Cox's method to the following
one-way fixed effects analysis of variance problem. An
experiment is carried out in stages, and at each stage a
fixed number $r_i$, for $i = 1,\ldots,k$, of observations are
taken from each group. Denote the $j$th observation on the
$i$th group at the $n$th stage by $X_{ijn}$. Let

$$SSB_n = \sum_{i=1}^{k} n r_i (\bar{X}_i - \bar{X})^2$$

and

$$SSW_n = \sum_{i=1}^{k}\sum_{j=1}^{r_i}\sum_{s=1}^{n} (X_{ijs} - \bar{X}_i)^2$$

where

$$\bar{X}_i = \frac{1}{n r_i}\sum_{j=1}^{r_i}\sum_{s=1}^{n} X_{ijs}, \qquad \bar{X} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{r_i}\sum_{s=1}^{n} X_{ijs}, \qquad N = \sum_{i=1}^{k} n r_i,$$

and

$$F_n = \frac{SSB_n/(k-1)}{SSW_n/(N-k)}. \tag{1.2.1}$$



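The distribution of the sequence $\{F_n\}$ depends only
upon the noncentrality parameter $\lambda$. Applying Cox's theorem,
a sequential test for discriminating between the hypotheses

$$H_0: \lambda = \lambda_0 \quad \text{vs.} \quad H_1: \lambda = \lambda_1, \quad \lambda_1 > \lambda_0 \tag{1.2.2}$$

for a given $\alpha$ and $\beta$ is specified by the decision rule

$$\text{Accept } H_0 \text{ if } \frac{f(F_n \mid \lambda_1)}{f(F_n \mid \lambda_0)} \le \frac{\beta}{1-\alpha},$$

$$\text{Reject } H_0 \text{ if } \frac{f(F_n \mid \lambda_1)}{f(F_n \mid \lambda_0)} \ge \frac{1-\beta}{\alpha}, \tag{1.2.3}$$

otherwise continue to the next stage.
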


An  equivalent  test  was  derived  by  Hoel  (1955)  using 
Wald's  method  of  weight  functions.  The  weight  function 
Hoel  employed  was  a generalization  of  that  used  for  Wald's 
sequential  t-test. 


The same sequential test (i.e., the test statistic of
(1.2.1) and decision rule (1.2.3) for the hypotheses (1.2.2))
has also been derived by Hall, Wijsman and Ghosh (1965). Their
derivation involved applying the principle of invariance.
They showed that the test statistic of equation (1.2.1) is
unchanged by any of the following transformations:

(i) $X'_{ijn} = C X_{ijn}$, $C > 0$

(ii) $X'_{ijn} = X_{ijn} + C$

(iii) an orthogonal transformation

Also, they were able to prove that the sequential test
was UMP for testing the hypotheses $H_0: \lambda \le \lambda_0$ vs.
$H_1: \lambda \ge \lambda_1$, by showing that the density $f(F_n \mid \lambda)$ possesses
a monotone likelihood ratio (Lehmann (1959)).

In addition, they proved that the vector of statistics
$T_n = \{\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k, SSW_n\}$ was a transitive sufficient
sequence. This finding is of importance in later chapters
of the thesis.


As previously explained, the sequential test is carried
out in stages, where at each stage a fixed number $r_i$, for
$i = 1,\ldots,k$, of observations are taken from each group.
Throughout the remainder of this thesis it will be assumed
that at the first stage two observations from each group will
be taken (this is so the statistic $SSW_n$ will not be zero at
the first stage). Each subsequent stage will result in one
observation from each group being taken (i.e., $r_i = 1$ for
all $i$). All future discussions will pertain to this
particular testing situation.


As in the fixed sample test, the density of the statistic
$F_n$, $f(F_n \mid \lambda)$, is that of a noncentral F variate and is given
in equation (1.1.1). Therefore, the decision rule of
equation (1.2.3) requires calculating the ratio of two noncentral
F densities. For specified values of $\alpha$, $\beta$, $\lambda_0$ and $\lambda_1$ the
decision rule can be reexpressed as:

accept $H_0$ if $R(F_n) \le \beta/(1-\alpha)$

reject $H_0$ if $R(F_n) \ge (1-\beta)/\alpha$

continue otherwise

where, with $N = nk$,

$$\Lambda_n = R(F_n) = e^{-n(\lambda_1 - \lambda_0)/2}\; \frac{M\!\left(\dfrac{N-1}{2},\, \dfrac{k-1}{2},\, \dfrac{n\lambda_1 (k-1) F_n}{2[k(n-1) + (k-1)F_n]}\right)}{M\!\left(\dfrac{N-1}{2},\, \dfrac{k-1}{2},\, \dfrac{n\lambda_0 (k-1) F_n}{2[k(n-1) + (k-1)F_n]}\right)}$$

and $M(x,y,u)$, known as the confluent hypergeometric function,
is defined as

$$M(x,y,u) = \sum_{t=0}^{\infty} \frac{\Gamma(y)\,\Gamma(x+t)}{\Gamma(x)\,\Gamma(y+t)}\, \frac{u^t}{t!}.$$


Since the above decision rule is a function of the
statistic $F_n$, the equations may be solved to obtain a
decision rule in terms of that statistic. That is, two
critical values of the statistic may be found, $F_n^A$ and
$F_n^R$, such that $R(F_n^A) = \beta/(1-\alpha)$ and $R(F_n^R) = (1-\beta)/\alpha$.
When these critical values have been calculated for all
stages, $F_n^A$ and $F_n^R$, $n = 2,\ldots,m_0$, the sequential
test can then be conducted by comparing the statistic $F_n$
of equation (1.2.1) against these critical values. In
summary, at every stage $n$ the following decision rule
is applied:

accept $H_0$ if $F_n \le F_n^A$

reject $H_0$ if $F_n \ge F_n^R$

continue otherwise.


The test is usually performed using the somewhat
simpler statistic

$$V_n = \frac{SSB_n}{SSW_n}.$$

The relationship between the two statistics $F_n$ and
$V_n$ is simply

$$F_n = \frac{(N-k)\,V_n}{(k-1)}.$$

Conducting the test with the statistic $V_n$ requires
transforming the critical region as well (e.g.,
$V_n^A = (k-1)F_n^A/(N-k)$). Tables of the critical values have
been prepared for selected values of $\alpha$, $\beta$, $k$, $\lambda_0$ and
$\lambda_1$ by Ray (1956) and B.K. Ghosh, et al. (1967). However, these
tables are in terms of the test statistic $G_n = V_n/k$. Appendix
B of this thesis contains a computer program which calculates the
critical values of $V_n$, $V_n^A$ and $V_n^R$, for specified values of
$\alpha$, $\beta$, $k$, $\lambda_0$, and $\lambda_1$.
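The program of Appendix B is not reproduced here. The following sketch indicates one way the critical values might be computed, under the assumption that each group has $n$ observations at stage $n$, so that $N = nk$. Substituting $F_n = (N-k)V_n/(k-1)$ into the third argument of $M$ reduces it to $n\lambda w/2$ with $w = V_n/(1+V_n)$, and the boundary equations are then solved by bisection in $w$. The function names are hypothetical.

```python
import math


def hyp_m(x, y, u, tol=1e-15, max_terms=10000):
    """Series for the confluent hypergeometric function M(x, y, u)."""
    term, total = 1.0, 1.0
    for t in range(max_terms):
        term *= (x + t) / (y + t) * u / (t + 1)
        total += term
        if term < tol * total:
            break
    return total


def ratio(w, n, k, lam0, lam1):
    """R as a function of w = V/(1+V); w = 1 stands for V = infinity."""
    x, y = (n * k - 1) / 2, (k - 1) / 2
    top = hyp_m(x, y, n * lam1 * w / 2)
    bottom = hyp_m(x, y, n * lam0 * w / 2)
    return math.exp(-n * (lam1 - lam0) / 2) * top / bottom


def solve_v(target, n, k, lam0, lam1):
    """V with R(V) = target, or None if R cannot equal target at stage n."""
    if not ratio(0.0, n, k, lam0, lam1) <= target <= ratio(1.0, n, k, lam0, lam1):
        return None
    lo, hi = 0.0, 1.0
    for _ in range(200):               # bisection; R increases with w
        mid = (lo + hi) / 2
        if ratio(mid, n, k, lam0, lam1) < target:
            lo = mid
        else:
            hi = mid
    return lo / (1.0 - lo)


def critical_values(n, k, alpha, beta, lam0, lam1):
    """Return (V_n^A, V_n^R); an entry is None where no boundary exists."""
    return (solve_v(beta / (1 - alpha), n, k, lam0, lam1),
            solve_v((1 - beta) / alpha, n, k, lam0, lam1))


# Stage 5 of a test with k = 2, alpha = beta = 0.10, lambda0 = 0, lambda1 = 1.
v_accept, v_reject = critical_values(5, 2, 0.10, 0.10, 0.0, 1.0)
```

For these settings the results can be compared with the untruncated stages of the regions listed for Test #1 in Table 1. A value of `None` corresponds to a stage at which the ratio cannot reach the boundary, so that no decision of that kind is possible.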
As with all statistical tests, one important
property of the test described above is the Operating
Characteristic Curve. The OC curve for the above test
is strictly a function of $\lambda$, and is given by

$$OC(\lambda^*) = \Pr\{\text{accepting } H_0: \lambda = \lambda_0 \text{ if } \lambda = \lambda^*\}.$$

Wald developed an approximation for the OC curve
of a sequential probability ratio test of $f(X,\theta_0)$
against $f(X,\theta_1)$ provided the equation

$$E_\theta\{[f(X,\theta_1)/f(X,\theta_0)]^h\} = 1$$

has a nonzero solution $h = h(\theta)$, and the $\{X_i\}$ are i.i.d.
However, since the above test is conducted on the
transformed sequence $\{V_n\}$ which are not independent, Wald's
approximation is not valid. Bhate (1955) developed a
conjectural formula, similar to Wald's approximation for
the OC curve, when the $\{X_i\}$ are not independent.

Ghosh (1970) suggests that substituting the sequence $\{V_n\}$
into Bhate's formula may yield a useful approximation to
the OC curve. The result of this substitution yields the
following approximation to the OC curve.


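If $h_i(\lambda)$ is a nonzero solution of the equation

$$\int \left[\frac{f_i(V_i \mid V_1,\ldots,V_{i-1};\lambda_1)}{f_i(V_i \mid V_1,\ldots,V_{i-1};\lambda_0)}\right]^{h_i(\lambda)} dF(V_i \mid V_1,\ldots,V_{i-1};\lambda) = 1$$

and $h_i(\lambda) \approx h(\lambda)$ for all $i > 1$, that is, $h_i(\lambda)$ varies
very little with $i$ for a given $\lambda$, then

$$OC(\lambda) \approx \frac{e^{A h(\lambda)} - 1}{e^{A h(\lambda)} - e^{B h(\lambda)}}$$

where

$$A \approx \ln\frac{1-\beta}{\alpha}, \qquad B \approx \ln\frac{\beta}{1-\alpha}.$$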

The crucial point in the use of the conjecture lies
in the verification of $h_i(\theta) \approx h(\theta)$ for various values
of $i$. Also it must be noted that this approximation is
only valid for infinite Wald regions.

The only other alternative, to date, for obtaining
the OC curve for this type of test is to employ Monte
Carlo techniques.


Also of interest in a sequential test is the Average
Sample Number function. For the above test the ASN function
will be defined as:

$$ASN(\lambda^*) = \text{Expected number of stages until a decision is reached if } \lambda = \lambda^*.$$

As with the OC curve, Wald's approximation to the ASN
is not valid due to the dependence of the $\{V_n\}$ sequence.
No general formula (exact or approximate) for the ASN for
composite hypotheses exists, but Bhate (1955) has developed
a conjectural formula along the same lines as that for
the OC curve. Ray (1956) has applied Bhate's conjectural
formula to the one-way fixed effect analysis of variance
test, and obtained expressions for $\lambda = \lambda_0, \lambda_1$. Again,
as with the OC curve, this procedure is valid only for
open Wald regions.

Since the regions are open, it is possible to
progress through a large number of stages before a
decision is reached. The number of stages will always
be finite, however (Johnson, 1953). One way of assuring
termination within a reasonable amount of time is to
truncate the test. Truncation involves altering the
Wald regions so that by some stage $m_0$ a decision can be
made.

This  thesis  will  be  concerned  with  developing 
procedures  to  obtain  the  ASN  function  and  OC  curve  for 
a SANOVA  test  with  any  given  set  of  truncated  regions. 

The  following  chapter  contains  a derivation  of  SANOVA 
for  the  case  k = 2 by  the  Direct  Method  of  Sequential 
Analysis  (Aroian,  1968). 


1.3  CONCLUSION 

This  chapter  has  served  to  introduce  the  SANOVA 
test.  This  thesis  will  pertain  to  obtaining  the  OC 
and  ASN  functions  of  such  a test.  Currently,  only 
approximations  exist,  such  as  that  of  Bhate  (1959), 
considered  in  this  chapter.  The  next  chapter  will 
derive  the  first  exact  procedure  for  obtaining  the 
OC  and  ASN  of  a k=2  SANOVA  test. 


CHAPTER  3 


APPROXIMATE  METHODS 


3.0  INTRODUCTION 


As  discussed  in  Chapter  I,  the  purpose  of  this 
thesis  is  to  explore  and  develop  procedures  for  obtaining 
the  properties  of  a SANOVA  test,  the  major  properties 
of  interest  being  the  OC  and  ASN  curves. 

Chapter  II  was  concerned  with  an  exact  procedure 
for  the  k=2  SANOVA  test.  The  theory  could  be  extended 
to  the  general  case  of  a k>2  SANOVA  test. 

For the general case, the joint sufficient statistic
vector would be of dimension $k + 1$ and consist of the
following elements:

$$\bar{X}_{in} = \frac{1}{n}\sum_{j=1}^{n} x_{ij}, \qquad i = 1,\ldots,k$$

$$S_n = \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}^2$$

Calculating the ASN and OC curves via the direct
method would now involve "carrying" a $k+1$ dimensional grid
of points $f_i(\bar{X}_{1i}, \bar{X}_{2i}, \ldots, \bar{X}_{ki}, S_i)$ from stage to stage,
$i = n_0,\ldots,m_0$. The density of a given point
$\bar{X}^*_{1i}, \bar{X}^*_{2i}, \ldots, \bar{X}^*_{ki}, S^*_i$ at stage $i$ must be evaluated
by performing a $k$ dimensional integration of the $2k+1$
dimensional density, $f_{i-1}(\bar{X}_{1i}, \bar{X}_{2i}, \ldots, \bar{X}_{ki},
S_i, Z_1, \ldots, Z_k)$, with respect to $Z_1, Z_2, \ldots, Z_k$.

The region of integration would depend upon the
point $\bar{X}^*_{1i}, \bar{X}^*_{2i}, \ldots, \bar{X}^*_{ki}, S^*_i$, as well as the regions
$V_A^{i-1}$ and $V_R^{i-1}$. For the special case when no decision
could be made at stage $i-1$ (i.e., $V_A^{i-1} < 0$ and
$V_R^{i-1} = \infty$), the integration region would be that of
integrating around a hypersphere. Other cases would
involve integrating around pieces of hyperellipses.

This process would be repeated for all points contained
on the grid, yielding the density $f_i(\bar{X}_{1i}, \bar{X}_{2i}, \ldots,
\bar{X}_{ki}, S_i)$.

To obtain the probabilities $P_A^i$, $P_R^i$, $P_C^i$
would involve performing a $k+1$ dimensional integration
of the density $f_i(\bar{X}_{1i}, \bar{X}_{2i}, \ldots, \bar{X}_{ki}, S_i)$, the integration
region in each case being that of a hyperparaboloid.

Although the direct method is theoretically possible
for $k > 2$, the amount of calculation required to perform
the procedure is so large as to make it impractical with
the current state of computer technology. An alternative
is to use an approximate procedure for obtaining the
ASN and OC curves. The purpose of this chapter is
the discussion of a new multivariate normal approximation.

Since this chapter is concerned with an approximate
procedure, time should be taken to discuss the approximate
technique currently most widely used, that of
Monte Carlo simulation.

3.1 MONTE CARLO SIMULATION METHODS


Monte Carlo simulation of a SANOVA test involves
performing many statistical tests of the hypotheses
and evaluating the overall performance of the test. The data for
each of the tests is computer generated. Since the
derivation of a SANOVA test requires that each
observation $X_{ij}$ be from a normal distribution with mean $\mu_i$
and standard deviation $\sigma$, the computer generation
of data consists of generating observations from each
of these normal distributions (Box and Muller (1964)).

For a given state of nature $\lambda$ (i.e., $\mu_1, \ldots, \mu_k$
and $\sigma$ chosen such that
$\lambda = \sum_{i=1}^{k} (\mu_i - \bar{\mu})^2/\sigma^2$), a large number, $N$,
of statistical tests may be simulated. Each statistical
test consists of generating vectors of observations

$$X_1(n), \ldots, X_k(n) \quad \text{where} \quad X_i(n) \sim N(\mu_i, \sigma)$$

until the statistic indicates acceptance or rejection
of the hypothesis (i.e., $V_n \le V_n^A$ or $V_n \ge V_n^R$).

The result of $N$ such tests is the following tally:

$f_A^i$ = relative frequency of accepting $H_0$ at stage $i$

$f_R^i$ = relative frequency of rejecting $H_0$ at stage $i$.
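The tally can be summarized in the natural way: the estimated OC is the total acceptance frequency, and the estimated ASN is the mean stage at which the test terminates. The sketch below assumes this reading of the tally and applies it to the relative frequencies printed for Case 1 of Table 3; the function name `oc_and_asn` is hypothetical.

```python
# Tally for Case 1 of Table 3 (lambda = 0), stages n = 2, ..., 15.
stages = list(range(2, 16))
f_accept = [0, 0, 0, .257533, .208967, .130567, .0868667, .0637667,
            .0463667, .0311333, .0245333, .0225, .0183, .0100333]
f_reject = [0, .433333e-3, .0115333, .0115333, .983333e-2, .743333e-2,
            .496667e-2, .503333e-2, .0033, .0062, .746667e-2, .856667e-2,
            .0112, .0119333]


def oc_and_asn(stages, f_accept, f_reject):
    """OC = total acceptance frequency; ASN = mean stage of termination."""
    oc = sum(f_accept)
    asn = sum(n * (fa + fr) for n, fa, fr in zip(stages, f_accept, f_reject))
    return oc, asn


oc, asn = oc_and_asn(stages, f_accept, f_reject)
```

The two results can be compared with the probability of accepting $H_0$ and the average sample number printed at the foot of Case 1 in Table 3.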


This entire process is repeated for all desired values
of $\lambda$, yielding approximate OC and ASN curves.

An  example  of  statistical  simulation  is  given  in  Hahn 
and  Shapiro  (1969). 
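The simulation procedure described above can be sketched as follows. This is an illustration and not the program used to produce Tables 3 and 4: the function names, the seed, and the use of Python's `random.gauss` in place of the Box and Muller generator are assumptions. The regions are those printed for the truncated SANOVA Test #1, with two observations per group at the first stage and one per group thereafter.

```python
import random

# Truncated regions of SANOVA Test #1: stage n -> critical value of V_n.
ACCEPT = {5: .0149, 6: .0316, 7: .0431, 8: .0517, 9: .0585, 10: .0642,
          11: .065, 12: .07, 13: .08, 14: .09, 15: .1}
REJECT = {3: 24.7841, 4: 2.0189, 5: 1.0794, 6: .7543, 7: .5912, 8: .4937,
          9: .4292, 10: .3835, 11: .3, 12: .25, 13: .2, 14: .15, 15: .1}


def v_statistic(groups):
    """V_n = SSB_n / SSW_n for equally sized groups."""
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / len(groups)
    ssb = n * sum((m - grand) ** 2 for m in means)
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return ssb / ssw


def run_one_test(mus, sigma, rng, last_stage=15):
    """Simulate one sequential test; return (accepted, stage of decision)."""
    groups = [[rng.gauss(mu, sigma)] for mu in mus]   # first of two initial draws
    for n in range(2, last_stage + 1):
        for g, mu in zip(groups, mus):
            g.append(rng.gauss(mu, sigma))
        v = v_statistic(groups)
        if n in ACCEPT and v <= ACCEPT[n]:
            return True, n
        if n in REJECT and v >= REJECT[n]:
            return False, n
    raise RuntimeError("regions must force a decision at the last stage")


def simulate(mus, sigma, trials, seed=1):
    """Estimate (OC, ASN) from repeated simulated tests."""
    rng = random.Random(seed)
    results = [run_one_test(mus, sigma, rng) for _ in range(trials)]
    oc = sum(accepted for accepted, _ in results) / trials
    asn = sum(n for _, n in results) / trials
    return oc, asn


# State of nature lambda = 0: both population means equal.
oc_null, asn_null = simulate([0.0, 0.0], 1.0, trials=10000)
```

Because the accept and reject values coincide at stage 15, every simulated test terminates by that stage. The estimates vary with the seed and the number of trials.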

Tables  3 and  4 contain  the  simulation  results  for  the 
two  truncated  SANOVA  tests  shown  in  Tables  1 and  2.  Figures 
15  and  16  compare  the  ASN  and  OC  curves  against  the 
corresponding  fixed  sample  test.  The  fixed  sample  OC 
curves  were  calculated  exactly  by  the  methods  discussed  in 
Appendix  A. 

Because  Monte  Carlo  simulation  involves  random  values, 
the  results  are  subject  to  statistical  fluctuations.  Thus 
any  estimate  of  OC  or  ASN  will  not  be  exact  but  will 
have  an  associated  error  band.  The  larger  the  number  of 
trials  in  the  simulation,  the  more  precise  will  be  the  final 
answer,  and  we  can  obtain  as  small  an  error  as  desired  by 
conducting sufficient trials. Theoretically as $N \to \infty$,
TABLE 1

SANOVA and ANOVA Tests # 1


H0: LAM0 = 0.00 VS H1: LAM1 = 1.00

ALPHA = 0.10   BETA = 0.10

REQUIRED SAMPLE SIZE IS 10.0

OC FUNCTION FOR THE TEST

LAMDA     PROB OF ACCEPTING H0
0.00      0.9000
0.11      0.7276
0.22      0.5736
0.33      0.4450
0.44      0.3412
0.56      0.2594
0.67      0.1958
0.78      0.1470
0.89      0.1097
1.00      0.0815

CRITICAL VALUE OF F = 3.01

CRITICAL VALUE OF V = 0.16705


TABLE 1
(CONT.)


SEQUENTIAL ANOVA TEST
*************************************

K = 2 MEANS

H0: LAM0 = 0 VS H1: LAM1 = 1

SEQUENTIAL TEST WITH THESE REGIONS

 N     ACCEPT     REJECT
 2     ***        ***
 3     ***        24.7841
 4     ***        2.0189
 5     .0149      1.0794
 6     .0316      .7543
 7     .0431      .5912
 8     .0517      .4937
 9     .0585      .4292
10     .0642      .3835
11     .065       .3
12     .07        .25
13     .08        .2
14     .09        .15
15     .1         .1

The Sequential Test has been Truncated at Step 15
TABLE 2

SANOVA and ANOVA Tests # 2


K = 2 GROUPS

H0: LAM0 = 0.00 VS H1: LAM1 = 1.00

ALPHA = 0.01   BETA = 0.01

REQUIRED SAMPLE SIZE IS 27.0

OC FUNCTION FOR THE TEST

LAMDA     PROB OF ACCEPTING H0
0.00      0.9900
0.11      0.8160
0.22      0.5793
0.33      0.3673
0.44      0.2154
0.56      0.1192
0.67      0.0631
0.78      0.0323
0.89      0.0160
1.00      0.0076

CRITICAL VALUE OF F = 7.15

CRITICAL VALUE OF V = 0.13748
TABLE 2
(CONT.)


SEQUENTIAL ANOVA TEST

K = 2 MEANS

LAM0 = 0.00 VS LAM1 = 1.00

A TEST WITH THESE REGIONS

 N     ACCEPT     REJECT
 2
 3
 4
 5                42.7966
 6                21.9007
 7                11.2670
 8                5.5505
 9                2.4422
10     0.0049     0.8062
11     0.0102     0.6950
12     0.0153     0.6101
13     0.0203     0.5449
14     0.0249     0.4940
15     0.0293     0.4535
16     0.0333     0.4222
17     0.0370     0.3960
18     0.0405     0.3736
19     0.0430     0.3549
20     0.0460     0.3365
21     0.0496     0.3244
22     0.0523     0.3120
23     0.0540     0.3010
24     0.0571     0.2912
25     0.0593     0.2824
26     0.0614     0.2745
27     0.0633     0.2674
28     0.0700     0.2500
29     0.0850     0.1500
30     0.1000     0.1000

The Sequential Test has been Truncated at stage 30
TABLE 3


Simulation of SANOVA Test # 1

MONTE CARLO SIMULATION FOR SANOVA


SEQUENTIAL ANOVA TEST
*****************************

K = 2 MEANS

H0: LAM0 = 0 VS H1: LAM1 = 1

SEQUENTIAL TEST WITH THESE REGIONS

 N     ACCEPT     REJECT
 2     ***        ***
 3     ***        24.7841
 4     ***        2.0189
 5     .0149      1.0794
 6     .0316      .7543
 7     .0431      .5912
 8     .0517      .4937
 9     .0585      .4292
10     .0642      .3835
11     .065       .3
12     .07        .25
13     .08        .2
14     .09        .15
15     .1         .1
TABLE 3 (Continued)


CASE NUMBER 1

POPULATION 1 MEAN = 0 SIGMA = 1
POPULATION 2 MEAN = 0 SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = 0

SIMULATION SUMMARY

 N    PROB ACC       PROB CON       PROB REJ       PROB TERM
 2    0              1              0              0
 3    0              .999567        .433333E-3     .433333E-3
 4    0              .988033        .115333E-1     .115333E-1
 5    .257533        .718967        .115333E-1     .269067
 6    .208967        .500167        .983333E-2     .2188
 7    .130567        .362167        .743333E-2     .138
 8    .868667E-1     .270333        .496667E-2     .918333E-1
 9    .637667E-1     .201533        .503333E-2     .0688
10    .463667E-1     .151867        .0033          .496667E-1
11    .311333E-1     .114533        .0062          .373333E-1
12    .245333E-1     .825333E-1     .746667E-2     .032
13    .0225          .514667E-1     .856667E-2     .310667E-1
14    .0183          .219667E-1     .0112          .0295
15    .100333E-1     0              .119333E-1     .219667E-1

PROBABILITY OF ACCEPTING H0 = .900567

PROBABILITY OF REJECTING H0 = .994333E-1

AVERAGE SAMPLE NUMBER = 7.46313

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000

TABLE 3 (Continued)


CASE NUMBER 2

POPULATION 1 MEAN = 0 SIGMA = 1
POPULATION 2 MEAN = .471405 SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .111111

SIMULATION SUMMARY

 N    PROB ACC       PROB CON       PROB REJ       PROB TERM
 2    0              1              0              0
 3    0              .999033        .966667E-3     .966667E-3
 4    0              .973867        .251667E-1     .251667E-1
 5    .198433        .747433        .028           .226433
 6    .157867        .565833        .237333E-1     .1816
 7    .964667E-1     .4494          .199667E-1     .116433
 8    .068           .362933        .184667E-1     .864667E-1
 9    .486333E-1     .297933        .163667E-1     .065
10    .365667E-1     .247933        .134333E-1     .05
11    .026           .194967        .269667E-1     .529667E-1
12    .0223          .148767        .0239          .0462
13    .241333E-1     .950333E-1     .0296          .537333E-1
14    .0196          .420667E-1     .333667E-1     .529667E-1
15    .134333E-1     0              .286333E-1     .420667E-1

PROBABILITY OF ACCEPTING H0 = .711433

PROBABILITY OF REJECTING H0 = .288567

AVERAGE SAMPLE NUMBER = 8.1252

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000

TABLE 3 (Continued)


CASE NUMBER 3

POPULATION 1 MEAN = 0 SIGMA = 1
POPULATION 2 MEAN = .666667 SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .222222

SIMULATION SUMMARY

 N    PROB ACC       PROB CON       PROB REJ       PROB TERM
 2    0              1              0              0
 3    0              .998733        .126667E-2     .126667E-2
 4    0              .958333        .0404          .0404
 5    .153867        .761167        .0433          .197167
 6    .111267        .609567        .403333E-1     .1516
 7    .681667E-1     .5073          .0341          .102267
 8    .483333E-1     .4287          .302667E-1     .0786
 9    .0369          .364267        .275333E-1     .644333E-1
10    .296667E-1     .310133        .244667E-1     .541333E-1
11    .204333E-1     .246533        .431667E-1     .0636
12    .185333E-1     .185633        .423667E-1     .0609
13    .212333E-1     .1208          .0436          .648333E-1
14    .0201          .528667E-1     .478333E-1     .679333E-1
15    .0143          0              .385667E-1     .528667E-1

PROBABILITY OF ACCEPTING H0 = .5428

PROBABILITY OF REJECTING H0 = .4572

AVERAGE SAMPLE NUMBER = 8.54403

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000


TABLE 3 (Continued)


CASE NUMBER 4

POPULATION 1 MEAN = 0 SIGMA = 1
POPULATION 2 MEAN = .816497 SIGMA

SIGMA OF SIGMAS =

SIMULATION SUMMARY

 N    PROB ACC         PROB CON         PROB REJ         PROB TERM
 2    .0000000E 01     .1000000E 01     .0000000E 01     .0000000E 01
 3    .0000000E 01     .9979000E 00     .2100000E-02     .2100000E-02
 4    .0000000E 01     .9431330E 00     .5476670E-01     .5476670E-01
 5    .1188000E 00     .7625330E 00     .6180000E-01     .1806000E 00
 6    .8440000E-01     .6225000E 00     .5563330E-01     .1400330E 00
 7    .5136670E-01     .5233670E 00     .4776670E-01     .9913330E-01
 8    .3720000E-01     .4425670E 00     .4360000E-01     .8080000E-01
 9    .2810000E-01     .3768670E 00     .3760000E-01     .6570000E-01
10    .2183330E-01     .3204670E 00     .3456670E-01     .5640000E-01
11    .1490000E-01     .2483670E 00     .5720000E-01     .7210000E-01
12    .1403330E-01     .1840330E 00     .5030000E-01     .6433330E-01
13    .1533330E-01     .1153670E 00     .5333330E-01     .6866670E-01
14    .1536670E-01     .4673330E-01     .5326670E-01     .6863330E-01
15    .1100000E-01     .0000000E 01     .3573330E-01     .4673330E-01

PROBABILITY OF ACCEPTING H0 = .412333

PROBABILITY OF REJECTING H0 = .587667

AVERAGE SAMPLE NUMBER = 8.58383

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000

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TABLE  3 (Continued) 


CASE NUMBER 5

POPULATION 1 MEAN = 0        SIGMA = 1
POPULATION 2 MEAN = .942809  SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .444444

SIMULATION SUMMARY

 N    PROB ACC        PROB CON        PROB REJ        PROB TERM
 2    .0000000E 01    .1000000E 01    .0000000E 01    .0000000E 01
 3    .0000000E 01    .9980330E 00    .1966670E-02    .1966670E-02
 4    .0000000E 01    .9250670E 00    .7296670E-01    .7296670E-01
 5    .9206670E-01    .7541330E 00    .7886670E-01    .1709330E 00
 6    .6226670E-01    .6237330E 00    .6813330E-01    .1304000E 00
 7    .3813330E-01    .5212670E 00    .6433330E-01    .1024670E 00
 8    .2856670E-01    .4370000E 00    .5570000E-01    .8426670E-01
 9    .1926670E-01    .3699000E 00    .4783330E-01    .6710000E-01
10    .1550000E-01    .3130330E 00    .4136670E-01    .5686670E-01
11    .1110000E-01    .2360330E 00    .6590000E-01    .7700000E-01
12    .1013330E-01    .1684670E 00    .5743330E-01    .6756670E-01
13    .1146670E-01    .1004670E 00    .5653330E-01    .6800000E-01
14    .1123330E-01    .3876670E-01    .5046670E-01    .6170000E-01
15    .8333330E-02    .0000000E 01    .3043330E-01    .3876670E-01

PROBABILITY OF ACCEPTING HO = .308067
PROBABILITY OF REJECTING HO = .691933
AVERAGE SAMPLE NUMBER = 8.4859

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
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TABLE  3 (Continued) 


CASE NUMBER 6

POPULATION 1 MEAN = 0        SIGMA = 1
POPULATION 2 MEAN = 1.05409  SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .555556

SIMULATION SUMMARY

 N    PROB ACC      PROB CON      PROB REJ      PROB TERM
 2    0             1             0             0
 3    0             .996867       .313333E-2    .313333E-2
 4    0             .907733       .891333E-1    .891333E-1
 5    .0693         .741233       .0972         .1665
 6    .0475         .6091         .846333E-1    .132133
 7    .026          .508833       .742667E-1    .100267
 8    .193667E-1    .4244         .650667E-1    .844333E-1
 9    .0151         .352633       .566667E-1    .717667E-1
10    .0113         .293867       .474667E-1    .587667E-1
11    .743333E-2    .213867       .725667E-1    .08
12    .623333E-2    .147667       .599667E-1    .0662
13    .803333E-2    .835333E-1    .0561         .641333E-1
14    .693333E-2    .0305         .0461         .530333E-1
15    .523333E-2    0             .252667E-1    .0305

PROBABILITY OF ACCEPTING HO = .222433
PROBABILITY OF REJECTING HO = .777567
AVERAGE SAMPLE NUMBER = 8.31024

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
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TABLE  3 (Continued) 


CASE NUMBER 7

POPULATION 1 MEAN = 0       SIGMA = 1
POPULATION 2 MEAN = 1.1547  SIGMA = 1

SIGMA OF SIGMAS = 0

SIMULATION SUMMARY

 N    PROB ACC      PROB CON      PROB REJ      PROB TERM
 2    0             1             0             0
 3    0             .997033       .296667E-2    .296667E-2
 4    0             .890633       .1064         .1064
 5    .530333E-1    .725367       .112233       .165267
 6    .356333E-1    .590533       .0992         .134833
 7    .206333E-1    .483667       .862333E-1    .106867
 8    .0139         .399267       .0705         .0844
 9    .103667E-1    .329567       .593333E-1    .0697
10    .0084         .271633       .495333E-1    .579333E-1
11    .443333E-2    .191133       .760667E-1    .0805
12    .466667E-2    .127733       .587333E-1    .0634
13    .533333E-2    .688333E-1    .535667E-1    .0589
14    .0052         .225333E-1    .0411         .0463
15    .366667E-2    0             .188667E-1    .225333E-1

PROBABILITY OF ACCEPTING HO = .165267
PROBABILITY OF REJECTING HO = .834733
AVERAGE SAMPLE NUMBER = 8.09793

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000


TABLE 3 (Continued)

CASE NUMBER 8

POPULATION 1 MEAN = 0        SIGMA = 1
POPULATION 2 MEAN = 1.24722  SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .777778

SIMULATION SUMMARY

 N    PROB ACC      PROB CON      PROB REJ      PROB TERM
 2    0             1             0             0
 3    0             .996067       .393333E-2    .393333E-2
 4    0             .872067       .124          .124
 5    .384667E-1    .7012         .1324         .170867
 6    .263667E-1    .562367       .112467       .138833
 7    .153667E-1    .4565         .0905         .105867
 8    .100333E-1    .368533       .779333E-1    .879667E-1
 9    .0074         .2984         .627333E-1    .701333E-1
10    .556667E-2    .239767       .530667E-1    .586333E-1
11    .326667E-2    .160367       .761333E-1    .0794
12    .0029         .1015         .559667E-1    .588667E-1
13    .003          .525667E-1    .459333E-1    .489333E-1
14    .336667E-2    .178667E-1    .313333E-1    .0347
15    .263333E-2    0             .152333E-1    .178667E-1

PROBABILITY OF ACCEPTING HO = .118367
PROBABILITY OF REJECTING HO = .881633
AVERAGE SAMPLE NUMBER = 7.8272

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
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TABLE  3 (Continued) 


CASE NUMBER 9

POPULATION 1 MEAN = 0        SIGMA = 1
POPULATION 2 MEAN = 1.33333  SIGMA = 1

SIGMA OF SIGMAS = 0

LAMBDA = .888889

SIMULATION SUMMARY

 N    PROB ACC      PROB CON      PROB REJ      PROB TERM
 2    0             1             0             0
 3    0             .995667       .433333E-2    .433333E-2
 4    0             .8548         .140867       .140867
 5    .0302         .672267       .152333       .182533
 6    .189333E-1    .533033       .1203         .139233
 7    .011          .420767       .101267       .112267
 8    .793333E-2    .3301         .827333E-1    .906667E-1
 9    .0053         .2593         .0655         .0708
10    .0041         .202267       .529333E-1    .570333E-1
11    .243333E-2    .128533       .0713         .737333E-1
12    .176667E-2    .779667E-1    .0488         .505667E-1
13    .183333E-2    .0393         .368333E-1    .386667E-1
14    .196667E-2    .012          .253333E-1    .0273
15    .163333E-2    0             .103667E-1    .012

PROBABILITY OF ACCEPTING HO = .0871
PROBABILITY OF REJECTING HO = .9129
AVERAGE SAMPLE NUMBER = 7.526

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
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TABLE  3 (Continued) 


CASE NUMBER 10

POPULATION 1 MEAN = 0        SIGMA = 1
POPULATION 2 MEAN = 1.41421  SIGMA = 1

SIGMA OF SIGMAS = 0

SIMULATION SUMMARY

 N    PROB ACC        PROB CON        PROB REJ        PROB TERM
 2    .0000000E 01    .1000000E 01    .0000000E 01    .0000000E 01
 3    .0000000E 01    .9945330E 00    .5466670E-02    .5466670E-02
 4    .0000000E 01    .8270670E 00    .1674670E 00    .1674670E 00
 5    .2403330E-01    .6351670E 00    .1678670E 00    .1919000E 00
 6    .1446670E-01    .4935670E 00    .1271330E 00    .1416000E 00
 7    .7766670E-02    .3812670E 00    .1045330E 00    .1123000E 00
 8    .4633330E-02    .2949330E 00    .8170000E-01    .8633330E-01
 9    .4300000E-02    .2263330E 00    .6430000E-01    .6860000E-01
10    .3100000E-02    .1704330E 00    .5280000E-01    .5590000E-01
11    .1066670E-02    .1048000E 00    .6456670E-01    .6563330E-01
12    .1333330E-02    .6150000E-01    .4196670E-01    .4330000E-01
13    .1266670E-02    .2843330E-01    .3180000E-01    .3306670E-01
14    .1566670E-02    .8633330E-02    .1823330E-01    .1980000E-01
15    .8333330E-03    .0000000E 01    .7800000E-02    .8633330E-02

PROBABILITY OF ACCEPTING HO = .643667E-1
PROBABILITY OF REJECTING HO = .935633
AVERAGE SAMPLE NUMBER = 7.22667

TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
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TABLE  3 (Contin, ad) 

SIMULATION SUMMARY

Test #1

 λ       Probability of Accepting H0    Average Sample #
0.00     0.9006                         7.4631
0.11     0.7114                         8.1252
0.22     0.5428                         8.5440
0.33     0.4123                         8.5838
0.44     0.3081                         8.4859
0.56     0.2224                         8.3102
0.67     0.1653                         8.0979
0.78     0.1184                         7.8272
0.89     0.0871                         7.5260
1.00     0.0644                         7.2267
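The summary lines printed under each case appear to follow directly from the stage-by-stage columns: the probability of accepting H0 is the sum of the acceptance proportions over all stages, and the average sample number is the sum of N times the proportion terminating at N. The sketch below checks this reading against the Case 9 rows of Table 3. The relationship is inferred from the printed numbers rather than stated in the text, and the names `CASE_9` and `summarize` are illustrative only.

```python
# Rows transcribed from Case 9 of Table 3:
# (N, proportion accepting H0 at stage N, proportion rejecting H0 at stage N)
CASE_9 = [
    (2, 0.0, 0.0),
    (3, 0.0, 0.00433333),
    (4, 0.0, 0.140867),
    (5, 0.0302, 0.152333),
    (6, 0.0189333, 0.1203),
    (7, 0.011, 0.101267),
    (8, 0.00793333, 0.0827333),
    (9, 0.0053, 0.0655),
    (10, 0.0041, 0.0529333),
    (11, 0.00243333, 0.0713),
    (12, 0.00176667, 0.0488),
    (13, 0.00183333, 0.0368333),
    (14, 0.00196667, 0.0253333),
    (15, 0.00163333, 0.0103667),
]


def summarize(rows):
    """Return (P[accept H0], P[reject H0], average sample number).

    Assumes every trial terminates by the last listed stage, so the
    termination proportion at stage N is accept + reject at that stage.
    """
    p_accept = sum(acc for _, acc, _ in rows)
    p_reject = sum(rej for _, _, rej in rows)
    asn = sum(n * (acc + rej) for n, acc, rej in rows)
    return p_accept, p_reject, asn


p_accept, p_reject, asn = summarize(CASE_9)
print(round(p_accept, 4), round(p_reject, 4), round(asn, 3))
```

Under this reading the three computed values agree with the printed Case 9 summary (.0871, .9129, and 7.526) to the precision of the printout.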
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TABLE  4 


Simulation of SANOVA Test #2

MONTE CARLO SIMULATION FOR SANOVA

LAMD = 0.00 VS LAMD = 1.00

TEST WITH THESE REGIONS

[Printout listing the ACCEPT and REJECT regions for N = 2 through 30; the numerical entries are not legible in this copy.]


TABLE 4 (Continued)

SIMULATION SUMMARY OF SANOVA TEST #2

 λ       Probability of Accepting H0    Average Sample #
0.00     0.9826                         14.3163
0.11     0.7490                         18.9898
0.22     0.4912                         21.2247
0.33     0.2897                         21.6097
0.44     0.1660                         20.8568
0.56     0.0887                         19.5344
0.67     0.0449                         18.1556
0.78     0.0241                         16.7795
0.89     0.0122                         15.5924
1.00     0.0063                         14.6108
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and 


so  that 


mr 


m„ 


r P r i 

l fft1 ” l P ' 

i=l  i=l 


In practice, the allowable error is generally specified, and this information can be used to determine the required number of trials, $N_R$. The following expression, based on the normal approximation to the binomial distribution, may be used as a rough estimate of $N_R$:

$$N_R \approx \frac{(.25)}{E^2}\, z_{1-\alpha/2}^2$$

where $E$ represents the allowable error and $z_{1-\alpha/2}$ designates the $(1-\alpha/2)\,100$ percent point of a standard normal distribution.
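The estimate of the required number of trials can be evaluated directly. The following is a minimal sketch rather than the program used in the thesis; the function names `required_trials` and `worst_case_error` are hypothetical, and the factor .25 is the largest value that p(1 - p) can take for a binomial proportion.

```python
import math
from statistics import NormalDist


def required_trials(allowable_error, alpha):
    """Rough number of Monte Carlo trials so that an estimated proportion
    lies within allowable_error of the truth with confidence 1 - alpha."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # (1 - alpha/2) point of N(0, 1)
    return math.ceil(0.25 * z * z / allowable_error ** 2)


def worst_case_error(n_trials, alpha):
    """Inverse relation: allowable error implied by a given number of trials."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * math.sqrt(0.25 / n_trials)


print(required_trials(0.01, 0.05))
print(worst_case_error(30000, 0.05))
```

The second call applies the same expression in reverse to the 30000 trials used for each case of Table 3, giving the worst-case error at the 95 percent level.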

Recently the methods of Monte Carlo importance sampling have been applied to simulations of sequential tests (Siegmund (1976)). This more sophisticated type of simulation requires a smaller $N_R$ for a given degree of accuracy.
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3.2  MULTIVARIATE  NORMAL  APPROXIMATION 


As previously discussed, the direct method "carries" the joint density of the sufficient statistic, $f(X_{1i}, X_{2i}, \ldots, X_{Ki}, S_i)$, from stage to stage. At every stage this density is integrated to obtain the probabilities of acceptance, rejection, and continuation.

Consider the probability $P_A^i$. This quantity represents the probability of accepting $H_0$ at stage $i$. An axiom of sequential analysis requires that a decision to accept $H_0$ could not be made at stage $i$ unless all previous stages resulted in the decision to continue. Thus, this probability is a joint probability.

Since all decisions are based upon the statistic $V_n$, the probability $P_A^i$ is dependent upon the joint distribution of the statistics $V_2, V_3, \ldots, V_i$. Therefore, this probability is given by the following expression:

$$P_A^i = P\{(V_A^2 < V_2 < V_R^2) \cap \cdots \cap (V_A^{i-1} < V_{i-1} < V_R^{i-1}) \cap (V_i \le V_A^i)\}$$

One method of calculating this probability is to integrate the joint density $f(V_2, V_3, \ldots, V_i)$, or

$$P_A^i = \int_0^{V_A^i} \int_{V_A^{i-1}}^{V_R^{i-1}} \cdots \int_{V_A^2}^{V_R^2} f(V_2, V_3, \ldots, V_i)\, dV_2\, dV_3 \cdots dV_i \tag{3.2.1}$$


Similar expressions exist for $P_R^i$ and $P_C^i$:

$$P_R^i = \int_{V_R^i}^{\infty} \int_{V_A^{i-1}}^{V_R^{i-1}} \cdots \int_{V_A^2}^{V_R^2} f(V_2, V_3, \ldots, V_i)\, dV_2\, dV_3 \cdots dV_i \tag{3.2.2}$$

$$P_C^i = \int_{V_A^i}^{V_R^i} \int_{V_A^{i-1}}^{V_R^{i-1}} \cdots \int_{V_A^2}^{V_R^2} f(V_2, V_3, \ldots, V_i)\, dV_2\, dV_3 \cdots dV_i \tag{3.2.3}$$


One should also note that the identities of sequential analysis are still valid. For example,

$$P_A^i + P_R^i + P_C^i = \int_0^{\infty} \int_{V_A^{i-1}}^{V_R^{i-1}} \cdots \int_{V_A^2}^{V_R^2} f(V_2, V_3, \ldots, V_i)\, dV_2\, dV_3 \cdots dV_i = P_C^{i-1}$$
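The joint-probability structure described above (a decision at stage i is possible only if every earlier stage resulted in continuation) can be illustrated by classifying realized paths of the statistic. The sketch below is illustrative only: the boundary values and paths are made up, the names `first_decision` and `stage_probabilities` are hypothetical, and treating a value exactly on a boundary as a decision is an assumption.

```python
def first_decision(v_path, accept_bounds, reject_bounds):
    """Walk one path V_2, V_3, ... and return (stage, decision).

    Stage numbering starts at 2. A value at or below the acceptance
    boundary accepts H0; a value at or above the rejection boundary
    rejects H0 (boundary handling is an assumption of this sketch).
    """
    for offset, (v, v_a, v_r) in enumerate(zip(v_path, accept_bounds, reject_bounds)):
        stage = offset + 2
        if v <= v_a:
            return stage, "accept"
        if v >= v_r:
            return stage, "reject"
    return None, "continue"


def stage_probabilities(paths, accept_bounds, reject_bounds, stage):
    """Empirical (P_A, P_R, P_C) at the given stage from a list of paths."""
    upto = stage - 1  # number of statistics V_2 .. V_stage
    counts = {"accept": 0, "reject": 0, "continue": 0}
    for path in paths:
        decided_at, decision = first_decision(
            path[:upto], accept_bounds[:upto], reject_bounds[:upto]
        )
        if decision == "continue":
            counts["continue"] += 1
        elif decided_at == stage:
            counts[decision] += 1
    total = len(paths)
    return tuple(counts[key] / total for key in ("accept", "reject", "continue"))


# Hypothetical boundaries for stages 2, 3, 4 and a handful of made-up paths.
ACCEPT = [0.1, 0.2, 0.3]
REJECT = [2.0, 1.5, 1.0]
PATHS = [
    [0.05, 0.9, 0.9],  # accepts at stage 2
    [0.5, 0.1, 0.9],   # accepts at stage 3
    [0.5, 1.7, 0.2],   # rejects at stage 3
    [0.5, 0.6, 1.2],   # rejects at stage 4
    [0.5, 0.6, 0.5],   # still continuing after stage 4
    [2.5, 0.6, 0.5],   # rejects at stage 2
    [0.5, 0.6, 0.25],  # accepts at stage 4
    [0.7, 0.8, 0.6],   # still continuing after stage 4
]

p_a3, p_r3, p_c3 = stage_probabilities(PATHS, ACCEPT, REJECT, 3)
_, _, p_c2 = stage_probabilities(PATHS, ACCEPT, REJECT, 2)
print(p_a3 + p_r3 + p_c3, p_c2)
```

The two printed values coincide, which is the empirical counterpart of the identity relating the stage-i probabilities to the probability of continuing past stage i - 1.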


For any stage $n$, the statistic $V_n$, where

$$V_n = \frac{T_n}{D_n} = \frac{n \sum_{i=1}^{K} \left(\bar{X}_{i(n)} - \bar{X}_{(n)}\right)^2 / \sigma^2}{\sum_{i=1}^{K} \sum_{j=1}^{n} \left(X_{ij} - \bar{X}_{i(n)}\right)^2 / \sigma^2},$$

has a distribution dependent upon $\lambda$. The numerator $T_n$ is distributed as a noncentral $\chi^2$ variate with noncentrality parameter $n\lambda$ and $K-1$ degrees of freedom. $D_n$, the denominator, is distributed as a central $\chi^2$ variate with $K(n-1)$ degrees of freedom. Thus the statistic $V_n$ is distributed as a constant times a noncentral F variate with noncentrality parameter $n\lambda$ and degrees of freedom $K-1$, $K(n-1)$; i.e.:

$$T_n \sim \chi'^2_{K-1}(n\lambda)$$

$$D_n \sim \chi^2_{K(n-1)}$$

$$V_n \sim \frac{K-1}{K(n-1)}\, F'_{K-1,\,K(n-1)}(n\lambda)$$
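As a concrete illustration of the statistic, the sketch below computes the numerator, the denominator, and their ratio for a small made-up data set, together with the rescaled value K(n-1)/(K-1) times the ratio, which is the quantity having the F distribution. The function name `sanova_statistics` and the data are hypothetical; the common variance is passed in only for completeness, since it cancels in the ratio.

```python
def sanova_statistics(groups, sigma2=1.0):
    """Return (T_n, D_n, V_n, F_n) for K equal-sized groups of n observations."""
    k = len(groups)
    n = len(groups[0])
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / k
    # Between-groups sum of squares, scaled by the variance.
    t_n = n * sum((m - grand_mean) ** 2 for m in group_means) / sigma2
    # Within-groups sum of squares, scaled by the variance.
    d_n = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g) / sigma2
    v_n = t_n / d_n
    f_n = (k * (n - 1) / (k - 1)) * v_n  # undo the constant multiplying the F variate
    return t_n, d_n, v_n, f_n


# Two groups (K = 2) observed through stage n = 3; values are made up.
print(sanova_statistics([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]))
```

For these data the rescaled value coincides with the usual one-way analysis of variance F ratio (mean square between divided by mean square within).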


The statistic at stage $m$ ($m > n$), $V_m$, is correlated with the statistic at stage $n$, $V_n$, since $T_n$ and $T_m$ are correlated, as are $D_n$ and $D_m$. As shown in Appendix D,

$$\operatorname{cov}(T_n, T_m) = 2(K-1)\frac{n}{m} + 4n\lambda \tag{3.2.4}$$

$$\operatorname{cov}(D_n, D_m) = 2K(n-1) \tag{3.2.5}$$

$$\operatorname{cov}(T_n, D_n) = \operatorname{cov}(T_m, D_m) = 0$$

$$\operatorname{cov}(T_n, D_m) = \operatorname{cov}(T_m, D_n) = 0.$$

Therefore the joint distribution $f(V_n, V_m)$ is a type of bivariate noncentral F related distribution. Derivation of an explicit expression for the joint distribution is difficult.
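The covariances (3.2.4) and (3.2.5) can be turned into correlations by dividing by the standard deviations, which are obtained from the same formulas with m = n. The following sketch does this; the function names are hypothetical, and the only facts used beyond the two equations are that setting m = n in each recovers the variance of the corresponding chi-square variate.

```python
import math


def cov_t(n, m, k, lam):
    """cov(T_n, T_m) from (3.2.4); the smaller stage plays the role of n."""
    n, m = min(n, m), max(n, m)
    return 2.0 * (k - 1) * n / m + 4.0 * n * lam


def cov_d(n, m, k):
    """cov(D_n, D_m) from (3.2.5); the smaller stage plays the role of n."""
    n = min(n, m)
    return 2.0 * k * (n - 1)


def corr_t(n, m, k, lam):
    return cov_t(n, m, k, lam) / math.sqrt(cov_t(n, n, k, lam) * cov_t(m, m, k, lam))


def corr_d(n, m, k):
    return cov_d(n, m, k) / math.sqrt(cov_d(n, n, k) * cov_d(m, m, k))


# With lambda = 0 the numerator correlation reduces to n/m,
# and the denominator correlation to sqrt((n-1)/(m-1)).
print(corr_t(4, 9, 3, 0.0), corr_d(5, 17, 2))
```

Both correlations approach one as n approaches m, which is why the statistics at neighbouring stages are strongly dependent.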

The multivariate joint distribution $f(V_2, V_3, \ldots, V_i)$ is a new type of multivariate noncentral inverted Dirichlet distribution (Johnson and Kotz (1972)). Expressions for the joint density function or joint characteristic function have not been found at this time. However, it is certain that the expressions will be very complicated, making the evaluation of the integrals given in equations (3.2.1) - (3.2.3) impractical. This presents a major roadblock to obtaining answers via this procedure. The best one could currently hope for would be an approximation to the density $f(V_2, V_3, \ldots, V_i)$. Several approaches to approximating the density $f(V_2, \ldots, V_i)$ are available.

One approach is the construction of a multivariate Gram-Charlier or Edgeworth series expansion. Gulberg (1920) and Meinner (1934) have given explicit formulas for a Gram-Charlier expansion of an $m$ dimensional density function. In terms of the standardized variables,

$$f(\mathbf{x}) = \sum_{j_2=0}^{\infty} \sum_{j_3=0}^{\infty} \cdots \sum_{j_i=0}^{\infty} \left[ C_{j_2 j_3 \cdots j_i}\, \frac{\partial^{\,j_2+j_3+\cdots+j_i}}{\partial x_2^{j_2}\, \partial x_3^{j_3} \cdots \partial x_i^{j_i}} \right] Z_m(\mathbf{x}, \mathbf{0}, \mathbf{R})$$

where $Z_m(\mathbf{x}, \mathbf{0}, \mathbf{R})$ is a standardized $m$ dimensional normal density with correlation matrix $\mathbf{R}$. The coefficients $C_{j_2 j_3 \cdots j_i}$ are calculated from the mixed moments of the original distribution. Since the mixed moments can be derived by the methods given in Appendix D, this approach is possible but impractical for large values of $m$.

Chambers (1967) has given an algorithm for the construction of Edgeworth-type expansions for a general $m$-variate distribution; but the technique requires the joint characteristic function, which has not yet been obtained for $f(V_2, \ldots, V_i)$.

Another approach is to transform the original statistics into a new set $T_2, \ldots, T_i$, where $T_j = g(V_2, \ldots, V_i)$, such that the joint distribution $f(T_2, \ldots, T_i)$ may be approximated by another known multivariate distribution. The procedure usually involves finding a simple univariate transformation which will approximately transform each of the marginal distributions into a known univariate distribution, and then constructing a multivariate distribution with such marginal distributions. The most common transformation chosen is one which transforms to a normal distribution. This approach will now be considered.

First, note that since the distribution of the statistic $V_n$ is related to an F distribution, the SANOVA test may be conducted using the statistic

$$F_n = \left(\frac{K(n-1)}{K-1}\right) V_n$$

and regions

$$F_A^i = \left(\frac{K(i-1)}{K-1}\right) V_A^i, \qquad F_R^i = \left(\frac{K(i-1)}{K-1}\right) V_R^i. \tag{3.2.6}$$

The remainder of this discussion will consider the sequential test in terms of the statistic $F_n$.

As previously discussed, the distribution of the statistic $F_n$ is that of a noncentral F variate with noncentrality parameter $n\lambda$ and degrees of freedom $K-1$, $K(n-1)$. If $\lambda = 0$, this distribution becomes a central F variate. Consider first possible transformations for the case $\lambda = 0$.

Several transformations have been suggested for approximately normalizing a central F variate. Aroian (1941) concluded that an excellent approximation was that suggested by Paulson (1942). The technique is based on the Wilson-Hilferty (1931) approximation to the distribution of $\chi^2$.

The Wilson-Hilferty transformation is an approximate normalizing transformation of a $\chi^2$ variate. If the distribution of $X$ is $\chi^2$ with $\nu$ degrees of freedom, then the quantity $(X/\nu)^{1/3}$ is approximately normally distributed with mean $1 - (2/9\nu)$ and variance $(2/9\nu)$:

$$(X/\nu)^{1/3} \sim N\!\left(1 - \frac{2}{9\nu},\ \sqrt{\frac{2}{9\nu}}\right) \tag{3.2.7}$$
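The quality of the Wilson-Hilferty approximation (3.2.7) is easy to inspect numerically. The sketch below compares the approximate distribution function with the exact one for 2 degrees of freedom, a case chosen only because its exact distribution function, 1 - exp(-x/2), has closed form. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_hilferty_cdf(x, nu):
    """Approximate P[chi-square(nu) <= x] by treating (x/nu)^(1/3) as normal
    with mean 1 - 2/(9 nu) and variance 2/(9 nu)."""
    mean = 1.0 - 2.0 / (9.0 * nu)
    std_dev = math.sqrt(2.0 / (9.0 * nu))
    return NormalDist(mean, std_dev).cdf((x / nu) ** (1.0 / 3.0))


def chi2_cdf_2df(x):
    """Exact distribution function of chi-square with 2 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0)


for x in (0.5, 2.0, 6.0):
    print(x, wilson_hilferty_cdf(x, 2), chi2_cdf_2df(x))
```

Even at 2 degrees of freedom the two columns agree to about two decimal places, and the agreement improves as the degrees of freedom increase.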


If the distributions of $T_n$ and $D_n$ are each approximated in this way, then the distribution of $F_n^{1/3}$ is approximated by the distribution of the ratio of two independent normal variates. In fact, $F_n^{1/3}$ is approximately distributed as

$$\frac{\left(1 - \dfrac{2}{9\nu_1}\right) + U_1 \sqrt{\dfrac{2}{9\nu_1}}}{\left(1 - \dfrac{2}{9\nu_2}\right) + U_2 \sqrt{\dfrac{2}{9\nu_2}}} \tag{3.2.8}$$

where $U_1$, $U_2$ are independent unit normal variates and

$$\nu_1 = K-1, \qquad \nu_2 = K(n-1).$$

The distribution of the ratio of two independent normal variates may be approximated by a method suggested by Geary (1930). This method involves the following approximation: if $X_1$ and $X_2$ are independent normal random variates and $E[X_j] = \xi_j$, $\operatorname{var}(X_j) = \sigma_j^2$, $(j = 1, 2)$ with $\xi_2 \gg \sigma_2$, then the distribution of

$$\frac{R\,\xi_2 - \xi_1}{\left(\sigma_2^2 R^2 + \sigma_1^2\right)^{1/2}}, \qquad \text{where } R = X_1 / X_2,$$

is approximately standard normally distributed.

Using this additional approximation for the distribution of the ratio in (3.2.8) we are led to the approximation of taking

$$Z_n = \frac{\left(1 - \dfrac{2}{9\nu_2}\right) F_n^{1/3} - \left(1 - \dfrac{2}{9\nu_1}\right)}{\left[\dfrac{2}{9\nu_2}\, F_n^{2/3} + \dfrac{2}{9\nu_1}\right]^{1/2}} \tag{3.2.9}$$

to have a unit normal distribution.

Each of the statistics $F_n$, having marginal F distributions, may be transformed to the approximately normally distributed statistics $Z_n$. Thus, the original SANOVA test, using the statistic $F_n$ and regions $F_A^i$, $F_R^i$, is approximated by a sequential test using the statistic $Z_n$ and regions $Z_A^i$, $Z_R^i$. The probabilities $P_A^i$, $P_R^i$ and $P_C^i$ are then approximated by the appropriate integration of the joint density $f(Z_2, \ldots, Z_i)$.
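The normalizing transformation (3.2.9) can be checked against a case in which the central F distribution function is available in closed form. The sketch below assumes the Paulson form of the transformation, with the numerator degrees of freedom written `nu1` and the denominator degrees of freedom `nu2`; the function names are hypothetical. For 2 numerator degrees of freedom the exact distribution function is 1 - (1 + 2f/nu2)^(-nu2/2), which is used only as a reference value.

```python
import math
from statistics import NormalDist


def paulson_z(f, nu1, nu2):
    """Approximately unit normal transform of a central F(nu1, nu2) value."""
    a1 = 2.0 / (9.0 * nu1)
    a2 = 2.0 / (9.0 * nu2)
    numerator = (1.0 - a2) * f ** (1.0 / 3.0) - (1.0 - a1)
    denominator = math.sqrt(a2 * f ** (2.0 / 3.0) + a1)
    return numerator / denominator


def approx_f_cdf(f, nu1, nu2):
    """Approximate P[F(nu1, nu2) <= f] through the normal transform."""
    return NormalDist().cdf(paulson_z(f, nu1, nu2))


def exact_f_cdf_nu1_is_2(f, nu2):
    """Closed-form distribution function of F(2, nu2), for comparison only."""
    return 1.0 - (1.0 + 2.0 * f / nu2) ** (-nu2 / 2.0)


print(approx_f_cdf(4.1028, 2, 10), exact_f_cdf_nu1_is_2(4.1028, 10))
```

The same transform applied to an acceptance or rejection boundary for the F statistic gives the corresponding boundary for the transformed statistic, since the transform is increasing in f.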

Although each of the $Z_n$'s is approximately normally distributed, it is not necessary that their joint distribution be approximately multinormal. An example of this has been constructed by Pierce and Dykstra (1969). In particular, they show that if

(3.2.10)

then, although each subset of $X_1, \ldots, X_m$ has a joint multinormal distribution with variance-covariance matrix $I$, the complete set is not multinormally distributed.

Although examples like this can be constructed, many distributions are such that both the marginal and joint distributions are normal. If in fact $f(Z_2, \ldots, Z_i)$ is not multinormal, assuming it is amounts to yet another approximation.

In summary, the multivariate normal procedure (when $\lambda = 0$) approximates the SANOVA test by a test employing the $Z_n$ statistic and regions $Z_A^i$, $Z_R^i$, with the joint distribution $f(Z_2, \ldots, Z_i)$ being multinormal.

The joint density $f(Z_2, \ldots, Z_n)$ is given by:

$$f(Z_2, \ldots, Z_n) = f(\mathbf{Z}) = |\Sigma|^{-1/2} (2\pi)^{-m/2} \exp\!\left(-\tfrac{1}{2}\, \mathbf{Z}' \Sigma^{-1} \mathbf{Z}\right) \tag{3.2.11}$$

where $m$ is the dimension (i.e., $n-1$) and $\Sigma$ is the correlation matrix; i.e., $\Sigma_{ij} = \operatorname{corr}(Z_i, Z_j)$.

In order to specify the density, the elements of the correlation matrix $\Sigma$ need be determined. To obtain the elements of this matrix exactly would require a quadrivariate integration of the joint density of $T_n$, $D_n$, $T_m$, $D_m$. This integration takes the following form:

$$E(Z_n Z_m) = \iiiint Z_n Z_m\, f(T_n, D_n, T_m, D_m)\, dT_n\, dD_n\, dT_m\, dD_m \tag{3.2.12}$$

where $m > n$. (Note, the limits of integration for $D_m$ may be justified by referring to Appendix D.)

Since $T_n$ and $D_n$ and $T_m$ and $D_m$ are independent, the joint density $f(T_n, D_n, T_m, D_m)$ is the product of two joint densities; i.e.

$$f(T_n, D_n, T_m, D_m) = f(T_n, T_m)\, f(D_n, D_m).$$

The density $f(D_n, D_m)$ is a type of bivariate $\chi^2$, and is given in Appendix D. Its form is relatively simple.

Kibble (1941) obtained the moment generating function for a bivariate gamma distribution related to $f(T_n, T_m)$. He obtained a Laguerre polynomial series expansion for the density.

Thus the density $f(T_n, D_n, T_m, D_m)$ can be expressed in a series form. Since this density is quite complicated, the integrations required to evaluate $\rho(Z_n, Z_m)$ cannot be carried out analytically. However, they can be evaluated numerically.

Since the Wilson-Hilferty and Geary transformations are both approximations,

$$\mu_{Z_n} \approx \mu_{Z_m} \approx 0, \qquad \sigma_{Z_n} \approx \sigma_{Z_m} \approx 1,$$

and equation (3.2.12) is only an approximation to the correlation, $\rho(Z_n, Z_m)$. Obtaining exact values for the correlations would involve finding exact values for $\mu_{Z_n}$, $\mu_{Z_m}$, $\sigma_{Z_n}$, $\sigma_{Z_m}$ as well. Since the density $f(T_n, D_n, T_m, D_m)$ is known when $\lambda = 0$, it is possible to obtain all these quantities numerically. Thus, the distribution of $\mathbf{Z}$ would become a general multivariate normal with mean vector $\boldsymbol{\mu}$ and variance-covariance matrix $V$. This procedure was concluded to be impractical for two reasons.

The first involves examining the accuracy of the Wilson-Hilferty approximation. Kendall and Stuart (1958) show that $(\chi^2/\nu)^{1/3}$ converges to mean $1 - (2/9\nu)$ and variance $(2/9\nu)$ at a faster rate than the rate at which the distribution approaches normality. Thus using an approximation to $\rho(Z_n, Z_m)$, to the same degree of accuracy as that of $\mu_{Z_n}$ and $\sigma_{Z_n}$, should have little effect on the accuracy of the overall approximation.

Second, extending the procedure for $\lambda \ne 0$ would require a much greater amount of computation. For $\lambda \ne 0$, it would be desirable to find a different transformation

$$Z_n = g(F_n, \lambda, \mu_{F_n}, \sigma_{F_n})$$

such that $Z_n$ is approximately normally distributed. The correlations $\rho(Z_n, Z_m)$ (which are in actuality the quantities $E[Z_n Z_m]$) could still be obtained as in equation (3.2.12). However, when $\lambda \ne 0$, the joint density $f(T_n, T_m)$ is not known explicitly; rather, it must be obtained by inverting the joint characteristic function, $\phi_{T_n, T_m}(t_1, t_2)$. Thus to obtain the correlations would require evaluating the following six dimensional integration:

$$\rho(Z_n, Z_m) = \iiiint Z_n Z_m\, f(D_n, D_m)\, f(T_n, T_m)\, dT_n\, dD_n\, dT_m\, dD_m$$

where

$$f(T_n, T_m) = \frac{1}{(2\pi)^2} \iint e^{-i(t_1 T_n + t_2 T_m)}\, \phi_{T_n, T_m}(t_1, t_2)\, dt_1\, dt_2 \tag{3.2.13}$$

and $\phi_{T_n, T_m}(t_1, t_2)$ may be obtained from the derivations of Appendix D.

Since this integration could not be done analytically, it would have to be done numerically. To specify all the correlations in the correlation matrix for a SANOVA test with $m^*$ stages at which a decision could be made would require evaluating $\tfrac{1}{2}(m^{*2} - m^*)$ of the above integrals numerically.

Since exact calculation of the correlations was concluded to be impractical, an approximate procedure would be employed.

One method of approximation would involve assuming the Wilson-Hilferty transformation was exactly normally distributed. Thus the quantities:

$$X_1 = (T_n/\nu_{1n})^{1/3}, \quad X_2 = (D_n/\nu_{2n})^{1/3}, \quad X_3 = (T_m/\nu_{1m})^{1/3}, \quad X_4 = (D_m/\nu_{2m})^{1/3} \tag{3.2.14}$$

where

$$\nu_{1i} = K-1, \qquad \nu_{2i} = K(i-1),$$

would all be assumed to be normally distributed. The correlations $\rho(Z_n, Z_m)$ could then be calculated as:

$$\rho(Z_n, Z_m) = \iiiint Z_n Z_m\, f(X_1, X_2, X_3, X_4)\, dX_1\, dX_2\, dX_3\, dX_4$$

Since $X_1$ and $X_3$ and $X_2$ and $X_4$ are correlated, the joint distribution $f(X_1, X_2, X_3, X_4)$ cannot be constructed as the product of the corresponding marginal distributions. Although each of the marginal distributions is normal, obviously the joint distribution is not multinormal (since $D_m \ge D_n$).

Constructing the density $f(X_1, X_2, X_3, X_4)$ is not an easy task, and certainly its form can only complicate the integrations required to calculate $\rho(Z_n, Z_m)$. From a computational standpoint this approach does not appear to be a productive one to pursue.

This approach could be simplified by making the additional approximation that $f(X_1, X_2, X_3, X_4)$ is a quadrivariate normal. This would require calculating the covariances, $\operatorname{cov}(X_1, X_3)$ and $\operatorname{cov}(X_2, X_4)$, to completely specify the variance-covariance matrix of the quadrivariate normal distribution (since all the variances are assumed to be of the form $(2/9\nu)$ and $\operatorname{cov}(X_1, X_2) = \operatorname{cov}(X_3, X_4) = \operatorname{cov}(X_1, X_4) = \operatorname{cov}(X_2, X_3) = 0$). These covariances are given by the following expressions:

$$\operatorname{cov}(X_1, X_3) = (\nu_{1n}\nu_{1m})^{-1/3} \left\{ E\!\left[T_n^{1/3} T_m^{1/3}\right] - E\!\left[T_n^{1/3}\right] E\!\left[T_m^{1/3}\right] \right\}$$

$$\operatorname{cov}(X_2, X_4) = (\nu_{2n}\nu_{2m})^{-1/3} \left\{ E\!\left[D_n^{1/3} D_m^{1/3}\right] - E\!\left[D_n^{1/3}\right] E\!\left[D_m^{1/3}\right] \right\} \tag{3.2.15}$$


It should be noted that constructing the variance-covariance matrix in this manner does not guarantee that the matrix will be positive definite. This arises from the fact that the two covariances are calculated from a joint density in which $X_2$ and $X_4$ are constrained by $D_m \ge D_n$; yet the multivariate normal distribution does not have this restriction. If the matrix was not positive definite, approximating $f(X_1, X_2, X_3, X_4)$ by a quadrivariate normal would not be appropriate.

As shown in the above equations, actual calculation of $\operatorname{cov}(X_1, X_3)$ and $\operatorname{cov}(X_2, X_4)$ would require expressions for mixed fractional moments of $T_n$, $T_m$ and $D_n$, $D_m$. Since the joint density $f(D_n, D_m)$ is explicitly known, expressions can be found for $E[D_n^R D_m^S]$. However, as previously discussed, for $\lambda \ne 0$ similar expressions for $T_n$, $T_m$ could be obtained only through inverting the joint characteristic function.

In conclusion, the approximate procedure seems to offer no advantages over the exact method.

Another type of approximation often yielding useful results is that of statistical error propagation (Hahn and Shapiro (1967)).

Consider a complicated function of the random variates $X_1, X_2, \ldots, X_n$, say

$$W = g(X_1, X_2, \ldots, X_n).$$

In situations where exact calculations of the moments of $W$ are impractical, the method of statistical error propagation may yield useful approximations.

The method consists of expanding $g(X_1, X_2, \ldots, X_n)$ about $[E(X_1), E(X_2), \ldots, E(X_n)]$, the point at which each of the variables takes on its expected value, by a multivariate Taylor series. An equation for the expected value of $W$ is then obtained by taking expected values in the resulting expression and applying some simple algebra. Higher order moments are obtained in a similar manner.

The technique may also be used to approximate the covariance of two functions, $g(X_1, X_2, \ldots, X_n)$ and $h(X_1, X_2, \ldots, X_n)$. The covariance is given by the following expression:

$$\operatorname{cov}\big(g(X_1, \ldots, X_n),\, h(X_1, \ldots, X_n)\big) = E[q(X_1, \ldots, X_n)] - E[g(X_1, \ldots, X_n)]\, E[h(X_1, \ldots, X_n)]$$

where

$$q(X_1, X_2, \ldots, X_n) = g(X_1, X_2, \ldots, X_n) \cdot h(X_1, X_2, \ldots, X_n).$$

To find the series approximation to the covariance involves: developing Taylor series expansions for $g(X_1, \ldots, X_n)$, $h(X_1, \ldots, X_n)$, and $q(X_1, \ldots, X_n)$; taking expected values of each of these expressions; and finally subtracting the product of the first two expected value expressions from the third. The result of this yields the following expression for the covariance:

$$\operatorname{cov}\big(g(X_1, \ldots, X_n),\, h(X_1, \ldots, X_n)\big) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \left(\frac{\partial^* q}{\partial X^{i}}\right) \frac{E[(X-\mu_X)^{i}]}{i_1!\, i_2! \cdots i_n!} - \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \sum_{j_1=0}^{\infty} \cdots \sum_{j_n=0}^{\infty} \left(\frac{\partial^* h}{\partial X^{i}}\right) \left(\frac{\partial^* g}{\partial X^{j}}\right) \frac{E[(X-\mu_X)^{i}]\, E[(X-\mu_X)^{j}]}{i_1! \cdots i_n!\; j_1! \cdots j_n!}$$

where

$$E[(X-\mu_X)^{i}] = E\!\left[(X_1-\mu_{X_1})^{i_1} (X_2-\mu_{X_2})^{i_2} \cdots (X_n-\mu_{X_n})^{i_n}\right]$$

and

$$\frac{\partial^* q}{\partial X^{i}} = \left. \frac{\partial^{\,i_1+i_2+\cdots+i_n}\, q}{\partial X_1^{i_1}\, \partial X_2^{i_2} \cdots \partial X_n^{i_n}} \right|_{X = \mu_X} \tag{3.2.16}$$

This is an exact expression, but in practice only a finite number of terms are used, thus yielding an approximation. A Kth order approximation consists of retaining only those terms whose powers of the expected value sum to K or less. For example, a second order approximation is of the following form:

$$\operatorname{cov}\big(g(X_1, \ldots, X_n),\, h(X_1, \ldots, X_n)\big) \approx \sum_{i=1}^{n} \sum_{j=1}^{n} \left(\frac{\partial^* h}{\partial X_i}\right) \left(\frac{\partial^* g}{\partial X_j}\right) \operatorname{cov}(X_i, X_j) \tag{3.2.17}$$


This method can be used to develop approximate
formulas for the covariance of $Z_n$ and $Z_m$. Several
such approximations can be examined.


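One approach consists of letting

$$Z_n = h(F_n) = \frac{\left(1-\dfrac{2}{9K(n-1)}\right)F_n^{1/3} - \left(1-\dfrac{2}{9(K-1)}\right)}{\left[\dfrac{2}{9K(n-1)}\,F_n^{2/3} + \dfrac{2}{9(K-1)}\right]^{1/2}}$$

and

$$Z_m = h(F_m) = \frac{\left(1-\dfrac{2}{9K(m-1)}\right)F_m^{1/3} - \left(1-\dfrac{2}{9(K-1)}\right)}{\left[\dfrac{2}{9K(m-1)}\,F_m^{2/3} + \dfrac{2}{9(K-1)}\right]^{1/2}} \tag{3.2.18}$$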


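Then a second order approximation to $\operatorname{cov}(Z_n, Z_m)$
is given by:

$$\operatorname{cov}(Z_n, Z_m) \approx \left(\frac{\partial^{*} h}{\partial F_n}\right)\left(\frac{\partial^{*} h}{\partial F_m}\right)\operatorname{cov}(F_n, F_m)$$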
The necessary partial derivatives are of the following
form:


d*h 


dF  . 


where  v,  = K-l  and  v.  = K(i-l) 

1 1 

and 

> vm\  ( nv,  (v  -2)  + v.^m  ] 
cov  (F  , F ) = / _HJM  J _L_JB i { 


n m 


m(vn-2)(vln-4)(vm-2) 


(3.2.20) 




One difficulty with this approach arises when higher order
terms are added to the approximation. These higher order
terms involve expressions for the higher order moments, i.e.,
$E[(F_n-\mu_{F_n})^{R}(F_m-\mu_{F_m})^{S}]$, which do not exist for all n
and m. Thus, a higher order approximation could only
be used for a SANOVA test for which all the moments
could be calculated.
An alternative approximation to $\operatorname{cov}(Z_n, Z_m)$
can be developed by considering $Z_n$ and $Z_m$ to be the
following bivariate functions:


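$$Z_n = h(X_1, X_2) = \frac{\mu_2 X_1 - \mu_1 X_2}{\left[\sigma_2^2 X_1^2 + \sigma_1^2 X_2^2\right]^{1/2}}$$

$$Z_m = g(X_3, X_4) = \frac{\mu_4 X_3 - \mu_3 X_4}{\left[\sigma_4^2 X_3^2 + \sigma_3^2 X_4^2\right]^{1/2}} \tag{3.2.20}$$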


with $X_i$ being defined in (3.2.14) and

$$\mu_i = E[X_i], \qquad \sigma_i^2 = \operatorname{var}(X_i).$$


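Since $\operatorname{cov}(X_3, X_2) = \operatorname{cov}(X_1, X_4) = 0$, the second order
Taylor series approximation to $\operatorname{cov}(Z_n, Z_m)$ becomes

$$\operatorname{cov}(Z_n, Z_m) \approx \left(\frac{\partial^{*} h}{\partial X_1}\right)\left(\frac{\partial^{*} g}{\partial X_3}\right)\operatorname{cov}(X_1, X_3) + \left(\frac{\partial^{*} h}{\partial X_2}\right)\left(\frac{\partial^{*} g}{\partial X_4}\right)\operatorname{cov}(X_2, X_4) \tag{3.2.21}$$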


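where

$$\frac{\partial^{*} h}{\partial X_i} = \left.\frac{\partial h}{\partial X_i}\right|_{X=\mu},$$

$$\operatorname{cov}(X_1, X_3) = \operatorname{cov}\big((T_n/\nu_{1n})^{1/3},\,(T_m/\nu_{1m})^{1/3}\big) = (\nu_{1n}\nu_{1m})^{-1/3}\operatorname{cov}\big(T_n^{1/3},\, T_m^{1/3}\big),$$

$$\operatorname{cov}(X_2, X_4) = \operatorname{cov}\big((D_n/\nu_{2n})^{1/3},\,(D_m/\nu_{2m})^{1/3}\big) = (\nu_{2n}\nu_{2m})^{-1/3}\operatorname{cov}\big(D_n^{1/3},\, D_m^{1/3}\big),$$

and

$$
\begin{aligned}
\mu_1 = \mu_3 &= 1 - \frac{2}{9\nu_{1n}}, & \sigma_1^2 = \sigma_3^2 &= \frac{2}{9\nu_{1n}}, \\
\mu_2 &= 1 - \frac{2}{9\nu_{2n}}, & \sigma_2^2 &= \frac{2}{9\nu_{2n}}, \\
\mu_4 &= 1 - \frac{2}{9\nu_{2m}}, & \sigma_4^2 &= \frac{2}{9\nu_{2m}}.
\end{aligned}
\tag{3.2.22}
$$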

The quantities $\operatorname{cov}(X_1, X_3)$ and $\operatorname{cov}(X_2, X_4)$
could be calculated exactly from the corresponding
density function. However, since the quantities $\sigma_i^2$
and $\mu_i$ are only approximations it was decided to
approximate the covariances to the same degree of accuracy.
Thus, each of these covariances was approximated by a
second order Taylor series:

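$$\operatorname{cov}(X_1, X_3) \approx (9\nu_{1n}\nu_{1m})^{-1}\operatorname{cov}(T_n, T_m) = \frac{2}{9\nu_{1n}}\cdot\frac{n}{m}$$

and

$$\operatorname{cov}(X_2, X_4) \approx (9\nu_{2n}\nu_{2m})^{-1}\operatorname{cov}(D_n, D_m) = \frac{2}{9\nu_{2m}} \tag{3.2.23}$$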


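Substituting the expressions for the partial
derivatives and the covariances of equation (3.2.23) into
equation (3.2.21) yields the following approximation (for $n \le m$):

$$\operatorname{cov}(Z_n, Z_m) \approx \frac{\mu_2\mu_4\,\dfrac{2}{9\nu_{1n}}\cdot\dfrac{n}{m} + \mu_1\mu_3\,\dfrac{2}{9\nu_{2m}}}{\left[\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2\right]^{1/2}\left[\sigma_4^2\mu_3^2 + \sigma_3^2\mu_4^2\right]^{1/2}} \tag{3.2.24}$$

where $\mu_i$ and $\sigma_i^2$; i = 1,...,4 are given in (3.2.22).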

Table  5 contains  covariances  calculated  from  this 
approximate  formula,  for  several  values  of  n and  m. 
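
The step from (3.2.21) to the closed form can be made explicit. The derivation below is a sketch that assumes the Geary-type functions $h$ and $g$ of (3.2.20) and the moments of (3.2.22); it is a reconstruction of the intermediate algebra, not a quotation of the original working.

```latex
% Partial derivatives of h(X_1,X_2) = (\mu_2 X_1 - \mu_1 X_2)/(\sigma_2^2 X_1^2 + \sigma_1^2 X_2^2)^{1/2}.
% At X = \mu the numerator \mu_2 X_1 - \mu_1 X_2 vanishes, so only the
% derivative of the numerator survives:
\frac{\partial^{*} h}{\partial X_1} = \frac{\mu_2}{(\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2)^{1/2}},
\qquad
\frac{\partial^{*} h}{\partial X_2} = \frac{-\mu_1}{(\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2)^{1/2}} .

% The same argument applied to g(X_3,X_4) gives
\frac{\partial^{*} g}{\partial X_3} = \frac{\mu_4}{(\sigma_4^2\mu_3^2 + \sigma_3^2\mu_4^2)^{1/2}},
\qquad
\frac{\partial^{*} g}{\partial X_4} = \frac{-\mu_3}{(\sigma_4^2\mu_3^2 + \sigma_3^2\mu_4^2)^{1/2}} .

% Consistency check for n = m: then \mu_3=\mu_1, \mu_4=\mu_2,
% cov(X_1,X_3) = \sigma_1^2 and cov(X_2,X_4) = \sigma_2^2, so
\operatorname{cov}(Z_n,Z_n) \approx
\frac{\mu_2^2\sigma_1^2 + \mu_1^2\sigma_2^2}{\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2} = 1 .
```

The two minus signs in the $X_2$ and $X_4$ derivatives multiply to a plus sign, which is why both terms of the approximation enter with positive sign.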


The previous pages have contained discussions on
several approaches for obtaining $\operatorname{cov}(Z_n, Z_m)$. The
procedures considered were:

(1)  A direct integration of the joint density
     $f(T_n, T_m, D_n, D_m)$, as shown in equations
     (3.2.12) - (3.2.13).

(2)  A direct integration of the quadrivariate
     normal density $f(X_1, X_2, X_3, X_4)$, as shown in
     equations (3.2.14) - (3.2.15).

(3)  A Taylor series expansion in terms of $F_n$
     and $F_m$, as shown in equations (3.2.18) -
     (3.2.20).

(4)  A Taylor series expansion in terms of
     $T_n$, $T_m$, $D_n$, and $D_m$, as shown in equations
     (3.2.21) - (3.2.24).
As mentioned throughout the discussions a procedure must
be chosen which:

(1)  Requires a feasible amount of computation to
     specify all correlations in the matrix $\Sigma$.

(2)  Can easily be extended to the calculation of
     correlation for $\lambda \ne 0$.

(3)  Is at least as accurate as the Wilson-Hilferty
     and Paulson approximations.

The approximation of procedure (4) (or that of equation
(3.2.24)) does satisfy all the criteria.


The accuracy of this approximation is to the same
degree as that of the Paulson approximation, since
equation (3.2.24) is such that

$$\operatorname{cov}(Z_n, Z_n) = 1$$

and the Paulson approximation yields

$$\operatorname{cov}(Z_n, Z_n) = \operatorname{var}(Z_n).$$


The addition of higher order terms onto the approximation
would yield answers indicative of the inaccuracies
of the Paulson approximation. The magnitude of these
inaccuracies may be investigated by examining the theoretical
moments of $Z_n$. The exact raw moments of $Z_n$ are calculated
from the following integration:

$$E[Z_n^{R}] = \int_0^{\infty} Z_n^{R}\, f_{\nu_{1n},\nu_{2n}}(F_n)\, dF_n \tag{3.2.25}$$
where $f_{\nu_{1n},\nu_{2n}}(F_n)$ is the density function of an F
variate with $\nu_{1n}$, $\nu_{2n}$ degrees of freedom. From these
raw moments the variance, third central moment, and
kurtosis of $Z_n$ may be calculated. The accuracy of
the Paulson approximation is determined by the rate
at which

$$\mu_{Z_n} \to 0, \qquad \operatorname{var}(Z_n) \to 1, \qquad E[(Z_n-\mu_{Z_n})^3] \to 0, \qquad \frac{E[(Z_n-\mu_{Z_n})^4]}{[\operatorname{var}(Z_n)]^2} \to 3.$$


Since the raw moments of $Z_n$ could not be obtained
analytically, numerical integration was employed.
Table 6 contains the exact mean, variance,
third central moment, and kurtosis of $Z_n$, n = 2,...,30,
for the case $\nu_{1n} = K-1 = 1$ (the worst possible case).


Since the assumption of $Z_n$ being normally distributed
seemed to be more serious than the assumptions
$\mu_{Z_n} = 0$, $\sigma_{Z_n} = 1$, equation (3.2.24) was used to approximate
$\operatorname{cov}(Z_n, Z_m)$. From this equation all elements of the
matrix $\Sigma$, $\sigma_{n,m} = \operatorname{cov}(Z_n, Z_m)$, were calculated, thus


TABLE 6

Accuracy of The Paulson Approximation

INVESTIGATION OF THE ACCURACY OF THE GEARY TRANSFORMATION USED TO
TRANSFORM A CENTRAL F DISTRIBUTION TO A NORMAL VIA METHOD OF MOMENTS

FOR K = 2    DOF1 = 1    MEAN1 = .777778    VAR1 = .222222

                               EX GEARY   EX GEARY   EX GEARY    EX GEARY
STP  DOF2   MEAN2     VAR2     MEAN       VAR        3RD-R-MOM   4TH-R-MOM

  2    2   .888889  .111111   0.020011   0.992184   0.1?1062    2.87190?
  3    4   .944444  .055556   0.000214   0.999909   0.003815    2.99?30?
  4    6   .962963  .037037   0.000002   0.999996   0.000060    2.99?95?
  5    8   .972222  .027778   -.000000   0.999998   0.000001    2.99999?
  6   10   .977778  .022222   0.000000   0.999998   -.000000    2.9999??
  7   12   .981481  .018519   0.000000   0.999999   0.000000    2.999???
  8   14   .984127  .015873   -.000000   0.999998   -.000000    2.9999??
  9   16   .986111  .013889   -.000000   0.999998   -.000000    2.99????
 10   18   .987654  .012346   0.000000   0.999997   0.000001    3.000009
 11   20   .988889  .011111   -.000000   0.999999   -.000000    2.999990
 12   22   .989899  .010101   -.000000   0.999997   -.000000    3.00000?
 13   24   .990741  .009259   -.000000   0.999997   0.000000    3.000000
 14   26   .991453  .008547   0.000000   0.999998   0.000000    2.999990
 15   28   .992064  .007937   -.000000   0.999998   -.000000    3.00000?
 16   30   .992593  .007407   -.000000   0.999997   -.000000    2.9999??
 17   32   .993056  .006944   -.000000   0.999998   0.000000    2.99999?
 18   34   .993464  .006536   0.000000   0.999999   0.000000    2.9999??
 19   36   .993827  .006173   0.000000   0.999999   -.000000    2.999991
 20   38   .994152  .005848   0.000000   0.999999   0.000000    2.9999??
completely specifying the approximate distribution of
$Z = [Z_2, Z_3, \ldots, Z_{m_0}]'$.


In summary the multivariate normal approximation
(for $\lambda = 0$) involves the following approximations:

(1)  Assuming the random variable $Z_n = g(F_n)$,
     given in equation (3.2.9), is normally distributed
     with mean zero and standard deviation one.

(2)  Assuming the joint distribution of $Z = [Z_2, Z_3, \ldots, Z_{m_0}]'$
     is multinormal with mean vector zero and
     variance-covariance matrix $\Sigma$, its density
     being that of equation (3.2.11).

(3)  Assuming the elements of $\Sigma$, $\sigma_{n,m}$, are given
     by the Taylor series approximation equation (3.2.24).


To evaluate the overall accuracy of the MVN (for $\lambda = 0$)
approximation Monte Carlo techniques were employed.
For a given SANOVA test consisting of regions $V_A^i$, $V_R^i$,
transformed regions $Z_A^i$, $Z_R^i$ were calculated,
$i = 2, \ldots, m_0$. Next, the elements of $\Sigma$, $\sigma_{n,m}$, were calculated
for all $n = 2, \ldots, m_0$ and $m = 2, \ldots, m_0$. Then, random
vectors $Z = [Z_2, \ldots, Z_{m_0}]'$ were generated from a multinormal
distribution with mean vector zero and variance-covariance
matrix $\Sigma$ (Naylor et al (1966)). Each vector
generated was sequentially scanned until either of the
following inequalities was satisfied:

$$Z_i \le Z_A^i$$

$$Z_i \ge Z_R^i\,; \qquad i = 2, \ldots, m_0.$$


The end result of the simulation consisted of the
quantities $f_A^i$, $f_R^i$ (see Section 3.1), from
which OC($\lambda = 0$) and ASN($\lambda = 0$) could be calculated.
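
The scanning procedure just described can be sketched in a few lines. This is a minimal illustration, not the original program: the function name `simulate_mvn_test`, the three-stage boundaries, and the correlation matrix used in the usage lines are all hypothetical, and NumPy's multinormal generator stands in for the generator of Naylor et al.

```python
import numpy as np

def simulate_mvn_test(corr, z_accept, z_reject, stages, trials=30000, seed=0):
    """Estimate OC and ASN by scanning multinormal vectors stage by stage.

    corr     : correlation matrix of (Z_2, ..., Z_m0)
    z_accept : acceptance boundaries Z_A^i (use -inf where acceptance is impossible)
    z_reject : rejection boundaries Z_R^i
    stages   : sample size n associated with each coordinate
    """
    rng = np.random.default_rng(seed)
    stages = np.asarray(stages)
    z_accept = np.asarray(z_accept, dtype=float)
    z_reject = np.asarray(z_reject, dtype=float)
    z = rng.multivariate_normal(np.zeros(len(stages)), corr, size=trials)
    accept = z <= z_accept            # Z_i <= Z_A^i
    reject = z >= z_reject            # Z_i >= Z_R^i
    stop = accept | reject
    if not stop[:, -1].all():
        raise ValueError("the last stage must force a decision")
    first = stop.argmax(axis=1)       # index of the first stage with a decision
    rows = np.arange(trials)
    oc = accept[rows, first].mean()   # proportion of trials accepting H0
    asn = stages[first].mean()        # average stage at termination
    return float(oc), float(asn)

# Hypothetical three-stage test; the correlations here are illustrative only.
stages = [2, 3, 4]
corr = [[np.sqrt(min(a, b) / max(a, b)) for b in stages] for a in stages]
oc, asn = simulate_mvn_test(corr, [-np.inf, -1.0, 0.5], [2.5, 2.0, 0.5], stages)
print(oc, asn)
```

Setting the final acceptance and rejection boundaries equal forces a decision at the last stage, which mirrors a truncated test.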


Table 7 contains the results of a simulation
for the MVN approximation ($\lambda = 0$) of the SANOVA test
given in Table 1. These results should be compared
with those given in Table 3 (those obtained by
direct simulation of the SANOVA tests for $\lambda = 0$).


Although the frequencies $f_A^i$, $f_R^i$, $i = n_0, \ldots, m_0$,
differ for each of the two simulations, the summary
statistics, ASN and OC, are in remarkably good agreement.
This is shown by the following comparison table (for $\lambda = 0$):

                              ASN       OC
MVN approx. Monte Carlo:      7.763     0.9006
SANOVA Monte Carlo:           7.463     0.9006


TABLE 7

Simulation Results For MVN Approximation To SANOVA Test #1
For $\lambda = 0$

MULTIVARIATE NORMAL APPROXIMATION TO SEQUENTIAL ANALYSIS OF VARIANCE

LAMBDA = 0

PROBABILITIES OBTAINED VIA MONTE CARLO INTEGRATION OF MULTIVARIATE NORMAL

 N    PROB ACCEPT     PROB CONTINUE   PROB REJECT     PROB TERM

 2    .0000000E 01    .1000000E 01    .0000000E 01    .0000000E 01
 3    .0000000E 01    .9989000E 00    .1100000E-02    .1100000E-02
 4    .0000000E 01    .9874000E 00    .1150000E-01    .1150000E-01
 5    .2687330E 00    .7081670E 00    .1050000E-01    .2792330E 00
 6    .1608670E 00    .5388670E 00    .8433330E-02    .1693000E 00
 7    .1182000E 00    .4144670E 00    .6200000E-02    .1244000E 00
 8    .9083330E-01    .3189000E 00    .4733330E-02    .9556670E-01
 9    .6986670E-01    .2455330E 00    .3500000E-02    .7336670E-01
10    .5270000E-01    .1898000E 00    .3033330E-02    .5573330E-01
11    .3743330E-01    .1467330E 00    .5633330E-02    .4306670E-01
12    .3080000E-01    .1096330E 00    .6300000E-02    .3710000E-01
13    .3050000E-01    .7050000E-01    .8633330E-02    .3913330E-01
14    .2426670E-01    .3426670E-01    .1196670E-01    .3623330E-01
15    .1640000E-01    .0000000E 01    .1786670E-01    .3426670E-01

PROBABILITY OF ACCEPTING H0 = .9006
PROBABILITY OF REJECTING H0 = .0994
AVERAGE SAMPLE NUMBER = 7.76317
TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
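
The summary lines of Table 7 follow from its per-stage columns: the OC at $\lambda = 0$ is the sum of the acceptance probabilities, and the ASN is the termination-probability-weighted average of N. The sketch below recomputes both from values transcribed from the table (digits that are illegible in the scan were filled in using the row identity PROB TERM = PROB ACCEPT + PROB REJECT); the function name is illustrative.

```python
# Per-stage probabilities transcribed from Table 7 (N = 2, ..., 15).
stages = list(range(2, 16))
p_accept = [0.0, 0.0, 0.0, 0.268733, 0.160867, 0.1182, 0.0908333,
            0.0698667, 0.0527, 0.0374333, 0.0308, 0.0305, 0.0242667, 0.0164]
p_reject = [0.0, 0.0011, 0.0115, 0.0105, 0.00843333, 0.0062, 0.00473333,
            0.0035, 0.00303333, 0.00563333, 0.0063, 0.00863333, 0.0119667,
            0.0178667]

def oc_and_asn(stages, p_accept, p_reject):
    """OC = total acceptance probability; ASN = sum of n * P(terminate at n)."""
    oc = sum(p_accept)
    asn = sum(n * (a + r) for n, a, r in zip(stages, p_accept, p_reject))
    return oc, asn

oc, asn = oc_and_asn(stages, p_accept, p_reject)
print(round(oc, 4), round(asn, 3))
```

The recomputed values agree with the printed summary (.9006 and 7.76317) to the precision of the tabulated entries.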
It should also be noted that this particular
example is one of the most difficult for the MVN
approximation. Since all of the normalizing approximations
improve with increasing degrees of freedom, the
MVN approximation will improve for larger values of K.
Thus, the MVN approximation is a reasonable approximation
to consider. However, to be useful, the approach must
be extended for $\lambda \ne 0$.

When $\lambda \ne 0$, the distribution of $T_n$ becomes a
noncentral $\chi^2$ variate with noncentrality parameter $n\lambda$
and K-1 degrees of freedom. $D_n$ is still distributed
as a central $\chi^2$ variate with K(n-1) d.o.f. Thus, $F_n$
becomes a noncentral F variate with noncentrality
parameter $n\lambda$ and K-1, K(n-1) degrees of freedom.
A normalizing transformation must now be found for a
noncentral F variate.

Finding a normalizing transformation for a noncentral
F variate requires first finding a normalizing transformation
for a noncentral $\chi^2$ variate. Sankaran
(1963) considered the problem of determining an index
h which optimally normalizes

$$W = \left[\chi^2_{\nu}(\lambda)\,/\,(\nu+\lambda)\right]^{h}.$$
He showed that for

$$h = 1 - \tfrac{2}{3}\,(\nu+\lambda)(\nu+3\lambda)(\nu+2\lambda)^{-2} \tag{3.2.26}$$

W is approximately normally distributed with mean

$$\mu_W(h) = 1 + h(h-1)(\nu+2\lambda)(\nu+\lambda)^{-2}$$

and variance

$$\sigma_W^2(h) = 2h^2(\nu+2\lambda)(\nu+\lambda)^{-2}.$$

In fact, $1/3 \le h \le 1/2$, agreeing with the Wilson-Hilferty
value h = 1/3 when $\lambda = 0$, and steadily
increasing as $\lambda$ increases. Although this normalizing
transformation works well for a noncentral $\chi^2$ variate,
a dilemma arises when trying to normalize the noncentral
F variate.
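
As a numerical illustration of (3.2.26), the sketch below evaluates the index h together with the approximate mean and variance of W. The function name `sankaran` is introduced here only for illustration.

```python
def sankaran(v, lam):
    """Index h of (3.2.26) with the approximate mean and variance of W."""
    h = 1 - (2 / 3) * (v + lam) * (v + 3 * lam) / (v + 2 * lam) ** 2
    mean = 1 + h * (h - 1) * (v + 2 * lam) / (v + lam) ** 2
    var = 2 * h ** 2 * (v + 2 * lam) / (v + lam) ** 2
    return h, mean, var

# At lam = 0 the Wilson-Hilferty values are recovered:
# h = 1/3, mean = 1 - 2/(9v), variance = 2/(9v).
print(sankaran(4, 0.0))

# h rises from 1/3 toward 1/2 as the noncentrality grows.
for lam in (0.0, 1.0, 10.0, 1000.0):
    print(lam, sankaran(4, lam)[0])
```

Writing $r = \lambda/\nu$, the index can be rearranged as $h = 1/3 + (2/3)\,r^2/(1+2r)^2$, which makes both the bounds and the monotone increase in $\lambda$ visible.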

To apply Geary's transformation both the numerator
and denominator of F must be normalized, where

$$F = \frac{\chi^2_{\nu_1}(\lambda)/\nu_1}{\chi^2_{\nu_2}/\nu_2}.$$

Clearly, the optimal normalizing power for the denominator
is always 1/3, whereas it is the h given by equation
(3.2.26) for the numerator. However, for large values
of $\nu_2$, $(\chi^2_{\nu_2}/\nu_2)^{p}$ to any power $1/3 \le p \le 1/2$ will be
approximately normally distributed.


Since a SANOVA test has $\nu_{2n} = K(n-1) \gg \nu_{1n} = K-1$,
the choice of the power so as to normalize the noncentral
F variate is dominated by that which normalizes the
noncentral $\chi^2$ variate. Thus, the following quantity:

$$Z_n = \frac{\mu_{2n}\left(\dfrac{\nu_{1n}F_n}{\nu_{1n}+n\lambda}\right)^{h_n} - \mu_{1n}}{\left[\sigma_{2n}^2\left(\dfrac{\nu_{1n}F_n}{\nu_{1n}+n\lambda}\right)^{2h_n} + \sigma_{1n}^2\right]^{1/2}}$$

with

$$\mu_{2n} = 1 + h_n(h_n-1)/\nu_{2n} = 1 + h_n(h_n-1)/K(n-1)$$

$$\sigma_{2n}^2 = 2h_n^2/\nu_{2n} = 2h_n^2/K(n-1)$$

$$\nu_{2n} = K(n-1)$$

and

$$\mu_{1n} = 1 + h_n(h_n-1)(\nu_{1n}+2n\lambda)(\nu_{1n}+n\lambda)^{-2}$$

$$\sigma_{1n}^2 = 2h_n^2(\nu_{1n}+2n\lambda)(\nu_{1n}+n\lambda)^{-2}$$

$$\nu_{1n} = K-1$$

$$h_n = 1 - \tfrac{2}{3}\,(\nu_{1n}+n\lambda)(\nu_{1n}+3n\lambda)(\nu_{1n}+2n\lambda)^{-2}$$

will be assumed to be approximately normally distributed
(Mudholkar, Chaubey, Lin (1976)). (Note this equation
becomes equation (3.2.9) when $\lambda = 0$.)
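
A small sketch of this transformation may help in checking it numerically. It assumes the form $Z_n = [\mu_{2n}w - \mu_{1n}]/[\sigma_{2n}^2 w^2 + \sigma_{1n}^2]^{1/2}$ with $w = (\nu_{1n}F_n/(\nu_{1n}+n\lambda))^{h_n}$; the function name `z_transform` is hypothetical.

```python
import math

def z_transform(F, n, K, lam=0.0):
    """Approximate normal deviate for an F value observed at stage n."""
    v1 = K - 1                     # numerator degrees of freedom
    v2 = K * (n - 1)               # denominator degrees of freedom
    nl = n * lam                   # noncentrality at stage n
    h = 1 - (2 / 3) * (v1 + nl) * (v1 + 3 * nl) / (v1 + 2 * nl) ** 2
    mu1 = 1 + h * (h - 1) * (v1 + 2 * nl) / (v1 + nl) ** 2
    var1 = 2 * h * h * (v1 + 2 * nl) / (v1 + nl) ** 2
    mu2 = 1 + h * (h - 1) / v2
    var2 = 2 * h * h / v2
    w = (v1 * F / (v1 + nl)) ** h
    return (mu2 * w - mu1) / math.sqrt(var2 * w * w + var1)

# With lam = 0 the index is h = 1/3 and the Paulson form is recovered.
print(z_transform(1.7, n=5, K=2))
print(z_transform(1.7, n=5, K=2, lam=0.5))
```

The transformation is increasing in F, so a region expressed as an interval for $F_n$ maps to an interval for $Z_n$.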

As in the previous discussion, the joint distribution
of $Z_2, \ldots, Z_{m_0}$ will be assumed to be multinormal with mean
vector zero and variance-covariance matrix $\Sigma$,
the elements of $\Sigma$ being $\sigma_{n,m} = \operatorname{cov}(Z_n, Z_m)$.

n m 


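Consider the quantities $Z_n$ and $Z_m$ expressed in
the following form:

$$Z_n = \frac{X_{1n}\mu_{2n} - \mu_{1n}X_{2n}}{\left[X_{1n}^2\sigma_{2n}^2 + X_{2n}^2\sigma_{1n}^2\right]^{1/2}}$$

and

$$Z_m = \frac{X_{1m}\mu_{2m} - \mu_{1m}X_{2m}}{\left[X_{1m}^2\sigma_{2m}^2 + X_{2m}^2\sigma_{1m}^2\right]^{1/2}}$$

where

$$X_{1i} = (\nu_{1i}+i\lambda)^{-h_i}\,T_i^{h_i}, \qquad X_{2i} = \nu_{2i}^{-h_i}\,D_i^{h_i} \tag{3.2.28}$$

with $\nu_{1i} = K-1$, $\nu_{2i} = K(i-1)$, and $T_i$, $D_i$ as defined
on page 3-8.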
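To calculate $\operatorname{cov}(Z_n, Z_m)$ the Taylor series
approximation of equation (3.2.24) will again be used.
This approximation becomes:

$$\operatorname{cov}(Z_n, Z_m) \approx \left(\frac{\partial^{*} Z_n}{\partial X_{1n}}\right)\left(\frac{\partial^{*} Z_m}{\partial X_{1m}}\right)\operatorname{cov}(X_{1n}, X_{1m}) + \left(\frac{\partial^{*} Z_n}{\partial X_{2n}}\right)\left(\frac{\partial^{*} Z_m}{\partial X_{2m}}\right)\operatorname{cov}(X_{2n}, X_{2m})$$

where the partial derivative notation is defined in
equation (3.2.16).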


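and

$$\operatorname{cov}(X_{1n}, X_{1m}) = (\nu_{1n}+n\lambda)^{-h_n}(\nu_{1m}+m\lambda)^{-h_m}\operatorname{cov}\big(T_n^{h_n},\, T_m^{h_m}\big)$$

and

$$\operatorname{cov}(X_{2n}, X_{2m}) = \operatorname{cov}\big((D_n/\nu_{2n})^{h_n},\,(D_m/\nu_{2m})^{h_m}\big) = \nu_{2n}^{-h_n}\nu_{2m}^{-h_m}\operatorname{cov}\big(D_n^{h_n},\, D_m^{h_m}\big) \tag{3.2.30}$$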
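Approximating these covariances by a Taylor series
yields

$$\operatorname{cov}(X_{1n}, X_{1m}) \approx \big[(\nu_{1n}+n\lambda)(\nu_{1m}+m\lambda)\big]^{-1} h_n h_m \operatorname{cov}(T_n, T_m) = \big[(\nu_{1n}+n\lambda)(\nu_{1m}+m\lambda)\big]^{-1} h_n h_m \left\{\frac{n}{m}\,2\nu_{1n} + 4n\lambda\right\}$$

and

$$\operatorname{cov}(X_{2n}, X_{2m}) \approx (\nu_{2n}\nu_{2m})^{-1} h_n h_m \operatorname{cov}(D_n, D_m) = (\nu_{2n}\nu_{2m})^{-1} h_n h_m \{2\nu_{2n}\} \tag{3.2.31}$$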


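Thus, by making the necessary substitutions the
approximation to $\operatorname{cov}(Z_n, Z_m)$ becomes:

$$\operatorname{cov}(Z_n, Z_m) \approx \frac{h_n h_m\left\{\dfrac{n}{m}\,2\nu_{1n} + 4n\lambda\right\}\big[(\nu_{1n}+n\lambda)(\nu_{1m}+m\lambda)\big]^{-1}\mu_{2n}\mu_{2m} + 2\nu_{2n}(\nu_{2n}\nu_{2m})^{-1}\mu_{1n}\mu_{1m}h_n h_m}{\left[\mu_{1n}^2\sigma_{2n}^2 + \mu_{2n}^2\sigma_{1n}^2\right]^{1/2}\left[\mu_{1m}^2\sigma_{2m}^2 + \mu_{2m}^2\sigma_{1m}^2\right]^{1/2}} \tag{3.2.32}$$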
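
The covariance approximation lends itself to a direct numerical check. The sketch below assumes that (3.2.32) has numerator $\mu_{2n}\mu_{2m}\operatorname{cov}(X_{1n},X_{1m}) + \mu_{1n}\mu_{1m}\operatorname{cov}(X_{2n},X_{2m})$, with the two covariances taken from (3.2.31) and $n \le m$; this reading is inferred from the surrounding equations, and the helper names are hypothetical.

```python
import math

def stage_moments(n, K, lam):
    """h_n and the approximate moments of the normalized numerator/denominator."""
    v1, v2, nl = K - 1, K * (n - 1), n * lam
    h = 1 - (2 / 3) * (v1 + nl) * (v1 + 3 * nl) / (v1 + 2 * nl) ** 2
    mu1 = 1 + h * (h - 1) * (v1 + 2 * nl) / (v1 + nl) ** 2
    var1 = 2 * h * h * (v1 + 2 * nl) / (v1 + nl) ** 2
    mu2 = 1 + h * (h - 1) / v2
    var2 = 2 * h * h / v2
    return h, mu1, var1, mu2, var2, v1, v2

def cov_z(n, m, K, lam=0.0):
    """Approximate cov(Z_n, Z_m); stages are reordered so that n <= m."""
    if n > m:
        n, m = m, n
    hn, mu1n, s1n, mu2n, s2n, v1, v2n = stage_moments(n, K, lam)
    hm, mu1m, s1m, mu2m, s2m, _, v2m = stage_moments(m, K, lam)
    cov_x1 = hn * hm * (2 * v1 * n / m + 4 * n * lam) / ((v1 + n * lam) * (v1 + m * lam))
    cov_x2 = hn * hm * (2 * v2n) / (v2n * v2m)
    num = mu2n * mu2m * cov_x1 + mu1n * mu1m * cov_x2
    den = math.sqrt(mu1n ** 2 * s2n + mu2n ** 2 * s1n) * math.sqrt(mu1m ** 2 * s2m + mu2m ** 2 * s1m)
    return num / den

# Correlation matrix for stages 2..6 of a K = 2 test at lam = 0.25.
stages = range(2, 7)
sigma = [[cov_z(a, b, K=2, lam=0.25) for b in stages] for a in stages]
for row in sigma:
    print(" ".join(f"{x:6.3f}" for x in row))
```

Under this reading the diagonal is exactly one for every $\lambda$, which is the property the text relies on when it treats the result as a correlation matrix.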
This approximation reduces to that of equation
(3.2.24) whenever $\lambda = 0$. As with the case $\lambda = 0$,
this approximation for $\operatorname{cov}(Z_n, Z_m)$ is of the same
degree of accuracy as the normalizing transformation
of (3.2.26), since

$$\operatorname{cov}(Z_n, Z_n) = \operatorname{var}(Z_n) = 1.$$

Thus, the approximation of equation (3.2.32) can
be used to calculate the elements of the correlation
matrix, $\Sigma$, for all values of $\lambda$. Having this correlation
matrix completely specifies the approximate distribution
of the vector $Z' = (Z_2, \ldots, Z_{m_0})$ given in (3.2.11).

To investigate the accuracy of the MVN approximation
a Monte Carlo simulation was performed. This
simulation consisted of applying the MVN approximation
to the SANOVA test given in Table 1. For a given value
of $\lambda$ this involved:

(1)  transforming the SANOVA test regions with the
     transformation of (3.2.21).

(2)  calculating the elements of the correlation
     matrix, $\Sigma$, from equation (3.2.32).

(3)  generating vectors $Z' = (Z_2, \ldots, Z_{m_0})$ from a
     multivariate normal distribution with mean
     vector zero and correlation matrix $\Sigma$.

(4)  sequentially scanning the elements of $Z$
     until

     $$Z_i \ge Z_R^i \quad\text{or}\quad Z_i \le Z_A^i, \qquad i = 2, \ldots, m_0.$$


The results of the simulation are given in Table 8.
Table 9 contains a comparison of the ASN and OC curves
for the SANOVA test and the MVN approximation. As seen
from this table the MVN approximation gives remarkably
accurate approximations to the OC and ASN curves of
a SANOVA test. Also, it must be remembered that this
particular SANOVA test (a K=2 test) is a worst case for
the MVN approximation; more accurate approximations will
be obtained for larger values of K.

The MVN approximations to OC and ASN need not
be obtained by Monte Carlo simulation. An alternative
is to obtain approximations to $P_A^i$, $P_R^i$ by direct
integration. As discussed previously, exact calculation
of the probabilities requires integrating the joint density
$f(F_2, \ldots, F_i)$, as shown in (3.2.3). Due to the form of
the joint density $f(F_2, \ldots, F_i)$, such calculations are
impractical. However, these quantities may now be approximated
by integration of the much simpler density $f(Z_2, \ldots, Z_i)$.
TOTAL NUMBER OF MONTE CARLO TRIALS = 30000
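These approximations are simply:

$$P_A^i \approx \int_{Z_A^2}^{Z_R^2}\cdots\int_{Z_A^{i-1}}^{Z_R^{i-1}}\int_{-\infty}^{Z_A^i} f(Z_2, \ldots, Z_i)\, dZ_i \cdots dZ_2$$

$$P_R^i \approx \int_{Z_A^2}^{Z_R^2}\cdots\int_{Z_A^{i-1}}^{Z_R^{i-1}}\int_{Z_R^i}^{\infty} f(Z_2, \ldots, Z_i)\, dZ_i \cdots dZ_2 \tag{3.2.33}$$

where $f(Z_2, \ldots, Z_i)$ is the multinormal density given
in (3.2.11).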

Unfortunately, the integrations cannot be done
analytically, and thus must be obtained by numerical
integration. For special forms of the correlation matrix, $\Sigma$,
or special types of integration regions, the dimension of
the integration can be reduced (Johnson and Kotz (1972)).
However, none of these techniques are applicable to the
integrals given in (3.2.33). Therefore, evaluation of
these integrals requires an (i-1) dimensional numerical
integration.


For a SANOVA test involving a large number of
stages at which a decision can be made, direct integration
of $f(Z_2, \ldots, Z_i)$ becomes impractical due
to the tremendous number of high dimensional numerical
integrations required. Thus Monte Carlo simulation
becomes the only practical method of evaluating the
integrals required for the MVN approximation.

One may argue that Monte Carlo simulation of the
MVN approximation seems an indirect approximate method
as compared with direct Monte Carlo simulation of the
SANOVA test. However, the advantages of Monte Carlo
simulation of the MVN approximation are two-fold.
First, for large values of K, the MVN simulation
will require less time than the SANOVA simulation.
Second, the MVN simulation utilizes the multinormal
distribution, which is currently one of the most widely
investigated multivariate distributions in statistics.
Thus improved simulation techniques (e.g., faster
generators, importance sampling, etc.) for the MVN can
easily be adapted.
3.3  CONCLUSION

This chapter of the thesis has considered approximations
for the OC and ASN curves of a SANOVA test
with K > 2 means.

The MVN approximation is a new approximation which
gives remarkably accurate results. The approximation
has been developed in sufficient generality to be valid
for any number of means and all values of $\lambda$. The procedure
requires less computation than standard Monte Carlo
simulation for SANOVA tests with a large number of means.
Also, the approximation will be more attractive as advances
in either numerical integration or approximations for
MVN probabilities are made.
CHAPTER 4

THE  EFFECTS  OF  DEPARTURES  FROM  THE 
UNDERLYING  ASSUMPTIONS  IN  SANOVA 

4.0  INTRODUCTION 

In the derivation of parametric tests it is usual to
assume a form of mathematical model involving some specific
probability distribution. In practical circumstances in
which statistical tests are applied, little is usually
known of the validity of such a model required for the
procedure. The investigation of the sensitivity of the
procedure to violations in the assumptions (termed
"robustness," Box (1955)) has been an area of intensive research
for fixed sample procedures. Several papers, Ewens (1961),
Bhattacharjee and Nagendra (1964), have investigated the
sensitivity of the sequential tests to departures from
assumptions.

The  purpose  of  this  chapter  is  to  study  the  effects 
of  violations  of  the  following  assumptions  made  in  SANOVA: 

(i)  equality  of  variance  of  the  errors 
(ii)  normality  of  the  errors. 

A study of this kind cannot be exhaustive, if only
because assumptions like these can be violated in
many more ways than they can be satisfied. Therefore,
the violations will be treated one at a time.

4.1  DEVIATIONS  FROM  HOMOGENEITY  OF  VARIANCE 

First consider assumption (i), that of homogeneity of
variances. The assumed model is that the k populations can
be described by a normal distribution with mean $\mu_i$ and
variance $\sigma^2$, the homogeneity of variance assumption being
that each population has a common variance $\sigma^2$. Departures
from this assumption occur when the populations have
variances $\sigma_i^2$, not all being equal. A series of papers
have considered the effect of departures from this assumption
in the fixed sample analysis of variance test: Box
(1953), F.N. David and N.L. Johnson (1951), Horsnell (1953),
Gronow (1951), Brown and Forsythe (1974), Kohr and Games
(1974), Box and Andersen (1955).

Box (1953) showed that the degree of departure from
the assumption could be characterized by the square of the
coefficient of variation of the variances, $c^2$. That is to
say, $c^2$ is the variance of the variances divided by the
square of the mean of the variances $\bar\sigma^2$:

$$c^2 = \frac{1}{k}\sum_{i=1}^{k}\left(\sigma_i^2 - \bar\sigma^2\right)^2 \Big/ \left(\bar\sigma^2\right)^2 \tag{4.1.1}$$

where

$$\bar\sigma^2 = \Big(\sum_{i=1}^{k}\sigma_i^2\Big)\Big/ k.$$
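
As a quick check on definition (4.1.1), the sketch below computes $c^2$ from a list of population standard deviations and reproduces two entries of the $c^2$ columns in Tables 10 and 11. The function name is illustrative.

```python
def variance_cv_squared(sigmas):
    """Squared coefficient of variation of the variances, as in (4.1.1)."""
    variances = [s * s for s in sigmas]
    k = len(variances)
    mean_var = sum(variances) / k
    # Variance of the variances uses divisor k, not k - 1.
    var_of_var = sum((v - mean_var) ** 2 for v in variances) / k
    return var_of_var / mean_var ** 2

print(round(variance_cv_squared([1.0, 1.47]), 3))      # Table 10, case 2
print(round(variance_cv_squared([1.0, 1.0, 3.0]), 3))  # Table 11, case 2
```

The divisor k is what makes the tabulated values come out; with k - 1 the two-population entries would double.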


The exact distribution of the F statistic under $H_0$:
$\lambda = 0$ when the variances were not all equal was obtained
by Robbins and Pitman (1949) as an infinite series of
F distributions.
Box (1953) showed that for the case of equal $n_i = n$,
the distribution of the F statistic under $H_0$: $\lambda = 0$
could be approximated by

$$F\{(k-1)\epsilon',\; k(n-1)\epsilon\}$$


where  e ' = { 1+? 


} and  c = (1+c  ) 


Although the approximation was not of great accuracy, it
did faithfully indicate the order and direction of the
effects of departures.

Since $\epsilon'$ and $\epsilon$ are less than unity when the
variances are not equal, the significance of effects tends
to be overestimated, resulting in a larger $\alpha$.

The findings of Box and others concurred in demonstrating
that deviations from the assumption of homogeneity
of variances had very little effect on $\alpha$, when
the $n_i$'s were equal and of reasonable size.

Horsnell  (1953)  investigated  the  effect  of  unequal 
group  variances  on  the  overall  power  of  the  fixed  sample 
test.  His  investigation  indicated  that  the  power  curve 
was  not  severely  affected  if  one  replaced  the  usual 
noncentrality parameter $\lambda$, where
k 


X = 


JlMPi-p- 


) 


with a modified noncentrality parameter $\lambda'$;


where 
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^ n ( u . - U ) 

L = 1 1 


The overall consensus is that the fixed sample analysis
of variance test is robust with regard to the homogeneity
of variance assumption.

As a first step toward investigating the sensitivity
of SANOVA to deviations from equality of variance, a
Monte Carlo simulation study was performed. For this
study several sequential tests were chosen, and the OC
and ASN curves were obtained via Monte Carlo simulation.
Simulations were then conducted to obtain the OC and ASN
at $H_0$ under several alternatives to the assumption of
homogeneity of variances. The results of the study are
summarized in Tables 10 and 11.

These results indicate that the SANOVA test is fairly
robust to deviations from the equality of variances assumption.
Also, as in the fixed sample test, the effect can be
characterized by the square of the coefficient of variation
of the variances of equation (4.1.1). The magnitude of
the effect may be theoretically approximated.
TABLE 10

MONTE CARLO INVESTIGATION OF HOMOGENEITY OF VARIANCE ASSUMPTION

Thesis Example: Test #2 (i.e., the test of Table 2)

Under H0: Lambda = 0

case #  sigma_1  sigma_2  var(sigma_i)   PAC     ASN    mean(sigma_i^2)   c^2

   1     1.00     1.00       0.000      .9826   14.31       1.000        0.000
   2     1.00     1.47       0.055      .9814   14.35       1.581        0.135
   3     1.00     1.67       0.111      .9803   14.30       1.895        0.223
   4     1.00     1.82       0.167      .9800   14.37       2.156        0.288
   5     1.00     1.94       0.222      .9794   14.33       2.382        0.337
   6     1.00     2.05       0.278      .9795   14.33       2.601        0.379
   7     1.00     2.15       0.333      .9790   14.34       2.811        0.415
   8     1.00     2.25       0.389      .9793   14.32       3.031        0.449
   9     1.00     3.00       1.000      .9784   14.34       5.000        0.640
  10     1.00     4.00       2.250      .9766   14.33       8.500        0.779
  11     1.00     5.00       4.000      .9761   14.33      13.000        0.852
  12     1.00     6.00       6.250      .9748   14.36      18.500        0.895
  13     1.00     7.00       9.000      .9742   14.34      25.000        0.922
TABLE 11

MONTE CARLO INVESTIGATION OF HOMOGENEITY OF VARIANCE ASSUMPTION

A TEST WITH k=3 MEANS

All Cases for H0: Lambda = 0

case #  sigma_1  sigma_2  sigma_3  var(sigma_i)   PAC     ASN     c^2

   1       1        1        1        0.00       .9835   16.47
   2       1        1        3        0.89       .9526   15.49   1.058
   3       1        1        4        2.00       .9446   15.14   1.39
   4       1        1        5        3.56       .9384   14.96   1.58
   5       1        2        4        1.56       .9597   15.74   0.85
   6       1        2        5        2.89       .9522   15.44   1.14
   7       1        1        2        0.22       .9688   16.02    .50
As previously discussed the SANOVA tests could be
derived in terms of the statistic $F_n$ rather than $V_n$,

where 

("I 

Fn  * ~^—n 

The expected value of $F_n$ when the variances are not all
equal is given by

k (n-1) {nA+k-1 } 

E[Fp]  = 2 F1(l,^k(n-1) ,k(n-l) ,C  / 

(k (n-1) -2) 

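where $c^2$ is the square of the coefficient of variation
of the population variances,

$$\lambda = \frac{\sum_i (\mu_i - \bar\mu)^2}{\bar\sigma^2},$$

and ${}_2F_1(a,b,c,z)$ is the Gaussian hypergeometric function

$${}_2F_1(a,b,c,z) = \sum_{n=0}^{\infty}\frac{(a)_n (b)_n}{(c)_n}\,\frac{z^n}{n!} = \frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\sum_{n=0}^{\infty}\frac{\Gamma(a+n)\Gamma(b+n)}{\Gamma(c+n)}\,\frac{z^n}{n!}$$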
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so 


k (n-1) 
k ( n— 1 ) -2 

k (n-1) 
k (n-1)  -2 


k (n-1) 


(nA+k-1) {1+ — - 

k (n-1) 

2] 

(n  +k-l) (1+hc  ) 


(4.1.2) 


In general, the exact distribution of $F_n$ when the
variances are not all equal is a complicated infinite
series of non-central F distributions.

An approximation to the distribution of $F_n$ is a
noncentral F distribution with degrees of freedom k-1
and k(n-1) and noncentrality parameter $n\lambda^{*}$, chosen so that

$$E\big[F^{*}_{n\lambda^{*}}(k-1,\, k(n-1))\big] = E[F_n].$$

This results in

(4.1.3)


If this expression did not contain n, the power of
the sequential test when the variances were not all equal
could be approximated by $\beta(\lambda^{*})$ where

$$\beta(\lambda^{*}) = \operatorname{Prob}\{\text{rejecting } H_0 \mid \lambda = \lambda^{*} \text{ when the assumptions are satisfied}\}.$$
An alternative is to approximate the power by $\beta(\lambda'')$,
where

$$\lambda'' = \lambda\left\{1 + \frac{c^2}{2}\right\} + \frac{(k-1)\,c^2}{2\,(\mathrm{ASN}(\lambda))^2} \tag{4.1.4}$$

and ASN($\lambda$) is the average sample number at that value of $\lambda$
when the assumptions are satisfied.

The effect of deviations from homogeneity of variance
on the power of SANOVA will not only depend upon the means
$\mu_1, \mu_2, \ldots, \mu_k$ and the standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_k$,
but also upon the following factors:

1.  The combinations $\mu_i, \sigma_i$

2.  The number of means, k

3.  The regions.

All but factor 1 are taken into consideration in the
above approximation. The effects of factor 1 will be
considered in the next section.

Factor 3 is taken into account in the following way:
the power under a deviation from the assumption is expressed in terms
of a power when the assumption is true, which is completely
determined by the regions.

Table 12 contains the results of this approximation
for the simulations of Table 10. As seen from this
table the approximation is fairly accurate.
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TABLE  12 


Departures from Normality Assumption
Test #1

Case #    c²       λ″ of equation (4.1.4)    β(λ″)†     Observed
   1     0.135           0.0003              0.9818     0.9814
   2     0.223           0.0005              0.9813     0.9803
   3     0.288           0.0007              0.9808     0.9800
   4     0.337           0.0008              0.9806     0.9794
   5     0.379           0.0009              0.9803     0.9795
   6     0.415           0.0010              0.9800     0.9790
   7     0.449           0.0011              0.9793     0.9793
   8     0.640           0.0016              0.9786     0.9784
   9     0.779           0.0019              0.9779     0.9766
  10     0.852           0.0021              0.9773     0.9761
  11     0.895           0.0022              0.9771     0.9748
  12     0.922           0.0023              0.9766     0.9742

† Obtained by interpolation of OC curve of Table 4.

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4.2  DEVIATIONS  FROM  NORMALITY 

In considering the effects of nonnormality it is convenient to use the measures $\gamma_1$ of skewness and $\gamma_2$ of kurtosis of the distribution of the random variable X. These quantities are defined as:

$$\gamma_1 = \sigma^{-3} E[(X-\mu)^3]$$

$$\gamma_2 = \sigma^{-4} E[(X-\mu)^4] - 3 . \qquad (4.2.1)$$

Other commonly used measures are $\beta_1 = \gamma_1^2$ for the magnitude of skewness, and $\beta_2 = \gamma_2 + 3$ for kurtosis.

For a symmetrical distribution $\gamma_1 = 0$. Positive values of $\gamma_1$ indicate the distribution is "skewed to the right." Every distribution has $\gamma_2 \geq -2$, with the normal distribution having $\gamma_2 = 0$. Distributions which have heavier tails and a central part more peaked than the normal have $\gamma_2 > 0$; those in which the tails are lighter and the central part flatter have $\gamma_2 < 0$.
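The definitions in (4.2.1) can be illustrated with a short numerical sketch. The function name `shape_measures` is hypothetical and is not taken from the thesis; it simply evaluates the population-style (divide by the number of values) moment ratios, together with $\beta_1 = \gamma_1^2$ and $\beta_2 = \gamma_2 + 3$.

```python
def shape_measures(xs):
    """Return (gamma1, gamma2, beta1, beta2) for the values in xs.

    Moments are computed with divisor len(xs), i.e. xs is treated as
    the whole population; this is an illustrative choice only.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    gamma1 = m3 / m2 ** 1.5                    # skewness
    gamma2 = m4 / m2 ** 2 - 3.0                # kurtosis (excess)
    return gamma1, gamma2, gamma1 ** 2, gamma2 + 3.0


# A symmetric two-point distribution attains the lower bound gamma2 = -2.
g1, g2, b1, b2 = shape_measures([-1.0, 1.0])
print(g1, g2, b1, b2)
```

The two-point example shows where the bound $\gamma_2 \geq -2$ is attained; a sample with a long right tail, such as `[0, 0, 0, 1]`, gives a positive $\gamma_1$.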

Although the first four moments of a population do not determine its form entirely, the value of $\gamma_2$ and to a lesser degree the value of $\gamma_1$ are the most important indicators of the extent to which nonnormality affects the usual inferences made in ANOVA.

Most  of  the  studies  of  the  effect  of  nonnormality 
on  fixed  sample  ANOVA  tests  have: 

(1)  assumed that all distributions have the same skewness $\gamma_{1i}$ and the same kurtosis $\gamma_{2i}$; i.e., $\gamma_{11} = \gamma_{12} = \cdots = \gamma_{1k} = \gamma_1$ and $\gamma_{21} = \gamma_{22} = \cdots = \gamma_{2k} = \gamma_2$.

(2)  dealt only with Type I errors.

Whenever  each  population  has  the  same  nonnormal 
distribution  (possibly  differing  only  with  respect  to 
location)  the  estimates 


$$\sum_{i=1}^{K}\sum_{j=1}^{n}\left(X_{ij}-\bar{X}_{i(n)}\right)^{2} \Big/ K(n-1)$$


still  continue  to  provide  unbiased  estimates  of 
population  variance,  but  they  are  no  longer  independently 
distributed.  In  fact,  David  and  Johnson  (1951)  showed: 

cov  (STn'SBn>  = Y2(I<2n  " nK  + K + r2>  (n~1)  / (K-l)n  . 

(4.2.2  ) 
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Since  these  quantities  are  not  independent,  the 
distribution  of  the  ratio  is  obviously  no  longer  an 
F distribution.  Gayen  (1950)  approximated  the
distribution  by  an  Edgeworth  expansion  for  the  case
y = 0;  and  obtained  correction  terms  for  calculating  an 
approximate Type I error given values of $\gamma_1$ and $\gamma_2$.

His  calculations  revealed  that  the  effect  of  nonnormality 
diminishes  rapidly  in  magnitude  with  increasing  sample 
size,  and  also  the  effect  of  kurtosis  was  larger  than 
that  of  skewness.  His  conclusions  generally  agreed 
with  previous  sampling  experiments  (Pearson  (1931))  and 
later  Pearson  curve  approximation  (David  and  Johnson 
(1951) ) . 

Box  and  Andersen  (1955)  employed  permutation  theory 
(Fisher (1935)) as a means of assessing the effects of
nonnormality on the Type I error of an ANOVA test. This
resulted in the effect being represented by a modification
of the degrees of freedom ($\nu_{1m}$, $\nu_{2m}$) of the F distribution.
The  degrees  of  freedom  are  modified  in  the  following  manner: 


v2m  = d(n-l) 

where

$$d = 1 + \frac{N+1}{N-1}\,\frac{C_2}{N - C_2}$$

with N = K(n-1)
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and 

C2  = Y2  ~ N_1  { 2t4  - 3y22  + 10y2  + 12YX2  } 

+ N-2  { 3y6  - 16y4Y2  + 15y23  + 36y4  + IZC^Y-f  88Yi2Y2  + 66y2  + 204Yl2} 

($\gamma_i$ being the standardized cumulants).    (4.2.3)

The  modification  turns  out  to  be  minor  except  for  very 
small  values  of  N. 

As previously mentioned, the effect of nonnormality on the probability of Type II errors has not received much attention. However, a sampling investigation for the two-tailed t-test for a single mean (Pearson (1929)) revealed that there was little effect on the power caused by nonnormality.

In conclusion, the general consensus among statisticians appears to be that the fixed sample ANOVA test is remarkably insensitive to nonnormality (at least the types of nonnormality considered).

Ewens (1961) considered the effect of nonnormality on the sequential test for a normal mean. He derived modified Wald approximations to the OC and ASN for such a test, when the assumption of normality is violated. His results reveal that, as in the fixed-sample tests, the sequential test for means is comparatively robust with respect to departure from assumed normality.


Evidently, the robustness of a fixed sample test to deviations from assumptions is an indication of the robustness of the analogous sequential test to the same deviations. This statement appears to hold for tests of means (Ewens (1961), Bhattacharjee and Nagendra (1964)) and variances (Wald (1947), Ewens (1961)), at least.

To investigate the sensitivity of SANOVA to deviations from the normality assumption, Monte Carlo techniques were employed. For this study several sequential tests were chosen, and the OC and ASN curves obtained via Monte Carlo simulation. This simulation was conducted under the assumption of all populations having normal distributions with mean $\mu_i$ and common variance $\sigma^2$.

A deviation from the assumption of normality involved selecting a distribution from the Johnson (1949) system of frequency curves. This system consists of three classes of distributions, $S_L$, $S_U$, and $S_B$, which provide one distribution corresponding to each pair of values $\sqrt{\beta_1}$ and $\beta_2$; i.e., there is just one appropriate distribution corresponding to each $(\beta_1, \beta_2)$ point. Thus, a given value of $\beta_1, \beta_2$ specifies the type of Johnson distribution.

For a given sequential test, the change in $\alpha$ was then observed for different values of $\beta_1, \beta_2$. The simulation for any given pair $\beta_1, \beta_2$ involved generating the variates from the appropriate Johnson distribution (1949).
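The thesis does not state how the Johnson variates were generated, so the following is only a minimal sketch of one standard route: transforming a standard normal deviate z through the inverse of the Johnson translation for each family. The function names and the shape parameters `gamma`, `delta`, `xi`, `lam` are illustrative assumptions (here `gamma` and `delta` are the Johnson shape parameters, not the $\gamma_1$, $\gamma_2$ of (4.2.1)); finding the parameter values that match a required $(\beta_1, \beta_2)$ pair is a separate step that is not shown.

```python
import math
import random


def johnson_variate(z, family, gamma, delta, xi=0.0, lam=1.0):
    """Map a standard normal deviate z to a Johnson variate.

    family is "SL" (lognormal), "SB" (bounded) or "SU" (unbounded).
    """
    y = (z - gamma) / delta
    if family == "SL":
        return xi + lam * math.exp(y)
    if family == "SB":
        return xi + lam / (1.0 + math.exp(-y))
    if family == "SU":
        return xi + lam * math.sinh(y)
    raise ValueError("family must be 'SL', 'SB' or 'SU'")


def johnson_sample(family, gamma, delta, size, xi=0.0, lam=1.0, seed=None):
    """Draw `size` Johnson variates using a private random generator."""
    rng = random.Random(seed)
    return [
        johnson_variate(rng.gauss(0.0, 1.0), family, gamma, delta, xi, lam)
        for _ in range(size)
    ]


# Hypothetical shape parameters, chosen only to exercise the code.
sample = johnson_sample("SU", gamma=-0.5, delta=1.5, size=5, seed=1)
print(sample)
```

In a simulation of the kind described above, each observation $X_{ij}$ would be drawn this way and then shifted and scaled to the required mean and variance before the sequential test is applied.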


Tables 13 and 14 contain the OC and ASN values for several SANOVA tests with deviations from normality. These tables contain the deviational $\beta_1$, $\beta_2$ pairs together with the established $\alpha$.
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TABLE  13 


DEPARTURES FROM NORMALITY ASSUMPTION
SANOVA TEST #1

Case #     β1       β2       PAC      ASN
   1      0.00     3.00    0.9006    7.463
   2      4.00    12.00    0.9047    7.655
   3      4.00    14.00    0.9014    7.695
   4      4.00    16.00    0.9023    7.692
   5      4.00    18.00    0.9042    7.703
   6      4.00    20.00    0.9068    7.716
   7      6.25    14.00    0.9060    7.742
   8      6.25    16.00    0.9025    7.712
   9      6.25    18.00    0.9066    7.740
  10      6.25    20.00    0.9073    7.730
  11      9.00    24.00    0.9048    7.783
  12     25.00    75.00    0.9106    7.938
  13      1.00     5.00    0.9018    7.553
  14      0.00     4.00    0.9002    7.517
  15      0.00     8.00    0.8991    7.626
  16      0.25     4.00    0.9026    7.497
  17      0.96     4.76    0.8987    7.526
  18      2.05     6.85    0.9030    7.600
  19      2.25     8.00    0.9000    7.622
  20      3.27     9.62    0.9025    7.653
DEPARTURES  FROM  NORMALITY  ASSUMPTION 
SANOVA  TEST  # 3 

As seen from these tables, the normality assumption had very little effect on the power of the test at the null hypothesis. As in the fixed sample test, the effect of kurtosis appears to be larger than that of skewness.



4.3  CONCLUSION 


This  chapter  of  the  thesis  has  consisted  of  an 
investigation  of  the  robustness  of  sequential  analysis 
of  variance  tests,  specifically,  how  the  power  of  such 
a test  is  affected  by  deviations  from  the  assumptions. 
The  assumptions  considered  were  homogeneity  of  variance 
and  normality. 

The  studies  conducted  have  shown  that  the  SANOVA 
test  is  remarkably  robust.  This  might  be  due  in  part 
to  the  experimental  procedure  of  a SANOVA  test,  where 
at  each  stage  of  the  experiment  an  equal  number  of 
observations  from  each  of  the  k groups  has  been  taken. 
And,  as  studies  of  the  fixed  sample  ANOVA  test  have 
shown,  deviations  have  minimal  effects  when  all  group 
sample  sizes  are  equal. 

These findings should increase the number of problems for which SANOVA may be useful.


CHAPTER  5 


CONCLUSION 


This thesis has developed procedures for obtaining the properties of a sequential analysis of variance test (SANOVA). The major properties of interest are the operating characteristic (OC) and average sample number (ASN) curves. These two curves are extremely important since they allow one to design a SANOVA test prior to experimentation.

For a test comparing only two means, this thesis has developed an exact procedure for obtaining the OC and ASN curves. Previous methods have all been approximate, the most common of them being Monte Carlo simulation. My exact procedure uses Aroian's direct method of sequential analysis and has been developed in sufficient detail to serve as a valuable tool for designing experiments.

For SANOVA tests involving more than two means, a new procedure has been developed in which the sequential test probabilities of acceptance, rejection, and continuation are approximated by multivariate normal (MVN) probabilities. This is accomplished by transforming the original SANOVA test statistic and regions so as to (approximately) normalize the test statistic. During the course of my research in this area the moment generating function of a new type of noncentral bivariate $\chi^2$ distribution was derived. Further research is required to obtain the multivariate extension as well as an expression for the density function. In addition, the ratios of the modified $\chi^2$ variates define a new type of noncentral F distribution.

The MVN approximation gave values in close conformance to those generated by direct Monte Carlo simulation for a variety of parameter sets, and this procedure may eventually supplant Monte Carlo simulations as various approximations for multivariate normal probabilities are further developed. During the investigation of this topic of the thesis one such approximation was developed. Although this was beyond the scope of this thesis, preliminary research has shown that the procedure may be useful.

An experimenter is often concerned whether the assumptions are satisfied for a SANOVA test. Much research has been conducted as to the robustness of fixed sample tests, but relatively few papers have appeared on the robustness of sequential tests.

Studies in this thesis revealed that a SANOVA test is robust to deviations in both the normality and homogeneity of variance assumptions. This finding should broaden the experimental situations to which SANOVA is applicable.
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APPENDIX  D 

DERIVATIONS  NECESSARY  FOR  THE 
MULTIVARIATE  NORMAL  APPROXIMATION 


D.1  DERIVATION OF THE JOINT DISTRIBUTION OF $D_n$ AND $D_m$

As discussed in Section (3.2) of the thesis, the SANOVA test uses the following $V_n$ statistic:

$$V_n = \frac{n \sum_{i=1}^{K} \left(\bar{X}_{i(n)} - \bar{X}_{(n)}\right)^2 / \sigma^2}{\sum_{i=1}^{K}\sum_{j=1}^{n} \left(X_{ij} - \bar{X}_{i(n)}\right)^2 / \sigma^2}$$

The statistics $D_n$ and $D_m$ of any two stages n and m (m > n) are not independent. In fact

$$D_m = \sum_{i=1}^{K}\sum_{j=1}^{m} \left(X_{ij} - \bar{X}_{i(m)}\right)^2 / \sigma^2 = \sum_{i=1}^{K}\sum_{j=1}^{n} \left(X_{ij} - \bar{X}_{i(m)}\right)^2 / \sigma^2 + \sum_{i=1}^{K}\sum_{j=n+1}^{m} \left(X_{ij} - \bar{X}_{i(m)}\right)^2 / \sigma^2 = D_n + W$$

It can easily be shown that each of these quantities is marginally distributed as:

$$D_n \sim \chi^2(\nu_{2n}), \qquad W \sim \chi^2(\nu_2), \qquad D_m \sim \chi^2(\nu_{2m})$$

where

$$\nu_{2n} = K(n-1), \qquad \nu_2 = K(m-n), \qquad \nu_{2m} = K(m-1)$$

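Since W and $D_n$ are independent, their joint distribution is simply the product of two $\chi^2$ densities:

$$f(W, D_n) = \frac{W^{\frac{1}{2}(\nu_2-2)}\, D_n^{\frac{1}{2}(\nu_{2n}-2)}\, e^{-\frac{1}{2}(W+D_n)}}{\Gamma(\nu_2/2)\,\Gamma(\nu_{2n}/2)\, 2^{\frac{1}{2}(\nu_2+\nu_{2n})}}$$

Since

$$W = D_m - D_n$$

the joint density of $D_n$ and $D_m$ is given by:

$$f(D_n, D_m) = \frac{(D_m-D_n)^{\frac{1}{2}(\nu_2-2)}\, D_n^{\frac{1}{2}(\nu_{2n}-2)}\, e^{-\frac{1}{2}D_m}}{\Gamma(\nu_{2n}/2)\,\Gamma(\nu_2/2)\, 2^{\frac{1}{2}(\nu_2+\nu_{2n})}}$$

which upon substituting the values for $\nu_2$ and $\nu_{2n}$ becomes:

$$f(D_n, D_m) = \frac{(D_m-D_n)^{\frac{1}{2}(K(m-n)-2)}\, D_n^{\frac{1}{2}(K(n-1)-2)}\, e^{-\frac{1}{2}D_m}}{\Gamma(K(n-1)/2)\,\Gamma(K(m-n)/2)\, 2^{\frac{1}{2}K(m-1)}} \quad \text{for } D_m \geq D_n \qquad (D.1.1)$$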


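The joint raw moments are obtained by performing the following integration of the density of equation (D.1.1):

$$E\left[D_n^{\,r} D_m^{\,s}\right] = \int_0^{\infty}\!\!\int_0^{D_m} D_n^{\,r}\, D_m^{\,s}\, f(D_n, D_m)\; dD_n\, dD_m$$

The result of this integration yields the following closed form expression:

$$E\left[D_n^{\,r} D_m^{\,s}\right] = \frac{\Gamma\!\left(r+\frac{1}{2}K(n-1)\right)\,\Gamma\!\left(r+s+\frac{1}{2}K(m-1)\right)\,2^{\,r+s}}{\Gamma\!\left(r+\frac{1}{2}K(m-1)\right)\,\Gamma\!\left(\frac{1}{2}K(n-1)\right)} \qquad (D.1.2)$$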
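The closed form (D.1.2) is easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and log-gamma is used simply to avoid overflow for large arguments. It evaluates $E[D_n^r D_m^s]$ and forms the covariance $E[D_n D_m] - E[D_n]E[D_m]$, which should reduce to $2K(n-1)$.

```python
import math


def joint_raw_moment(r, s, K, n, m):
    """E[D_n^r D_m^s] from equation (D.1.2), evaluated via log-gamma."""
    a = 0.5 * K * (n - 1)  # half the degrees of freedom of D_n
    c = 0.5 * K * (m - 1)  # half the degrees of freedom of D_m
    log_value = (
        math.lgamma(r + a)
        + math.lgamma(r + s + c)
        - math.lgamma(r + c)
        - math.lgamma(a)
        + (r + s) * math.log(2.0)
    )
    return math.exp(log_value)


def cov_dn_dm(K, n, m):
    """cov(D_n, D_m) built from the raw moments of (D.1.2)."""
    return joint_raw_moment(1, 1, K, n, m) - (
        joint_raw_moment(1, 0, K, n, m) * joint_raw_moment(0, 1, K, n, m)
    )


# Example with K = 3 groups and stages n = 4, m = 7.
print(joint_raw_moment(1, 0, 3, 4, 7))  # mean of D_n, K(n-1)
print(cov_dn_dm(3, 4, 7))               # compare with 2K(n-1)
```

Setting s = 0 or r = 0 recovers the ordinary moments of the marginal $\chi^2$ distributions of $D_n$ and $D_m$, which gives a quick sanity check on the formula.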
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D . 2 DERIVATION  OF  THE  JOINT  MOMENT  GENERATING  FUNCTION 

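OF $D_n$ AND $D_m$

The joint moment generating function of $D_n$ and $D_m$ may be obtained by performing the following integration:

$$M_{D_n,D_m}(t_1,t_2) = \int_0^{\infty}\!\!\int_0^{D_m} e^{t_1 D_n + t_2 D_m}\, f(D_n, D_m)\; dD_n\, dD_m$$

Upon substitution this becomes the following:

$$M_{D_n,D_m}(t_1,t_2) = \frac{1}{\Gamma(\nu_2/2)\,\Gamma(\nu_{2n}/2)\, 2^{\frac{1}{2}(\nu_2+\nu_{2n})}} \int_0^{\infty} e^{-\frac{1}{2}D_m}\, e^{t_2 D_m} \int_0^{D_m} D_n^{\frac{1}{2}(\nu_{2n}-2)} (D_m-D_n)^{\frac{1}{2}(\nu_2-2)}\, e^{t_1 D_n}\; dD_n\, dD_m$$

The first integral is given in Gradshteyn and Ryzhik (1965) (p. 318, integral #3.383) as:

$$M_{D_n,D_m}(t_1,t_2) = \frac{1}{\Gamma(\nu_2/2)\,\Gamma(\nu_{2n}/2)\, 2^{\frac{1}{2}(\nu_2+\nu_{2n})}} \int_0^{\infty} e^{-\frac{1}{2}D_m}\, e^{t_2 D_m}\, B\!\left(\tfrac{1}{2}\nu_2, \tfrac{1}{2}\nu_{2n}\right) D_m^{\frac{1}{2}(\nu_2+\nu_{2n}-2)}\; {}_1F_1\!\left(\tfrac{1}{2}\nu_{2n},\, \tfrac{1}{2}(\nu_2+\nu_{2n}),\, t_1 D_m\right) dD_m$$

where

$$B(x,y) = \Gamma(x)\,\Gamma(y) / \Gamma(x+y)$$

$${}_1F_1(x,y,z) = \sum_{i=0}^{\infty} \frac{\Gamma(x+i)\,\Gamma(y)}{\Gamma(y+i)\,\Gamma(x)}\,\frac{z^{i}}{i!}$$

The remaining integral yields (Gradshteyn and Ryzhik (1965), p. 860, #7.621.4):

$$M_{D_n,D_m}(t_1,t_2) = \frac{B(\nu_2/2, \nu_{2n}/2)\;\Gamma\!\left(\tfrac{1}{2}(\nu_2+\nu_{2n})\right)\left(\tfrac{1}{2}-t_2\right)^{-\frac{1}{2}(\nu_2+\nu_{2n})}}{\Gamma(\nu_2/2)\,\Gamma(\nu_{2n}/2)\, 2^{\frac{1}{2}(\nu_2+\nu_{2n})}}\; {}_2F_1\!\left(\tfrac{1}{2}\nu_{2n},\, \tfrac{1}{2}(\nu_2+\nu_{2n}),\, \tfrac{1}{2}(\nu_2+\nu_{2n}),\, t_1\left(\tfrac{1}{2}-t_2\right)^{-1}\right)$$

where ${}_2F_1(a,b,c,z)$ is the Gauss hypergeometric function (Slater (1965)).

The above expression can be simplified to the following form:

$$M_{D_n,D_m}(t_1,t_2) = (1-2t_2)^{-\frac{1}{2}(\nu_2+\nu_{2n})}\; {}_2F_1\!\left(\tfrac{1}{2}\nu_{2n},\, \tfrac{1}{2}(\nu_2+\nu_{2n}),\, \tfrac{1}{2}(\nu_2+\nu_{2n}),\, t_1\left(\tfrac{1}{2}-t_2\right)^{-1}\right)$$

This may be further simplified by noting the following identity (Abramowitz and Stegun (1964)):

$${}_2F_1(x,y,y,z) = (1-z)^{-x}$$

Performing this substitution and simplifying yields the following expression

$$M_{D_n,D_m}(t_1,t_2) = (1-2t_2)^{-\frac{1}{2}\nu_2}\,(1-2t_1-2t_2)^{-\frac{1}{2}\nu_{2n}} = (1-2t_2)^{-\frac{1}{2}K(m-n)}\,(1-2t_1-2t_2)^{-\frac{1}{2}K(n-1)} \qquad (D.2.1)$$

Setting $t_2 = 0$ recovers the $\chi^2$ moment generating function of $D_n$ with K(n-1) degrees of freedom, and setting $t_1 = 0$ recovers that of $D_m$ with K(m-1) degrees of freedom.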
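A numerical cross-check of the joint moment generating function is sketched below. It rests on one assumption that follows from Section D.1: since $D_m = D_n + W$ with $D_n$ and W independent $\chi^2$ variates, $E[e^{t_1 D_n + t_2 D_m}] = E[e^{(t_1+t_2) D_n}]\,E[e^{t_2 W}]$, so the factor $(1-2t_1-2t_2)$ carries the K(n-1) degrees of freedom of $D_n$ and the factor $(1-2t_2)$ carries the K(m-n) degrees of freedom of W. The function names are hypothetical. The mixed second derivative of the logarithm at the origin is approximated by central differences and should be close to the covariance $2K(n-1)$.

```python
import math


def log_mgf_d(t1, t2, K, n, m):
    """log E[exp(t1*D_n + t2*D_m)], using D_m = D_n + W with
    D_n ~ chi-square(K(n-1)) and W ~ chi-square(K(m-n)) independent."""
    a = 0.5 * K * (n - 1)
    b = 0.5 * K * (m - n)
    return -a * math.log(1.0 - 2.0 * t1 - 2.0 * t2) - b * math.log(1.0 - 2.0 * t2)


def mixed_second_derivative(f, h=1e-3):
    """Central-difference estimate of d^2 f / dt1 dt2 at (0, 0)."""
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h * h)


K, n, m = 3, 4, 7
cov_estimate = mixed_second_derivative(lambda t1, t2: log_mgf_d(t1, t2, K, n, m))
print(cov_estimate, 2 * K * (n - 1))
```

The finite-difference step h is an arbitrary illustrative choice; the estimate agrees with $2K(n-1)$ only up to an error of order $h^2$.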


D.3  DERIVATION OF THE MIXED CENTRAL MOMENTS OF $D_n$ AND $D_m$

The previous sections of this appendix have derived expressions for: the joint density of $D_n$ and $D_m$; the mixed raw moments of $D_n$ and $D_m$; and the joint moment generating function of $D_n$ and $D_m$. All three of these expressions are useful for obtaining the mixed central moments of $D_n$ and $D_m$.

The mixed central moments are defined as:

$$E\left[(D_n - \mu_{D_n})^{r}\,(D_m - \mu_{D_m})^{s}\right]$$

and are needed for the Taylor series approximation to cov($Z_n$, $Z_m$) of Section 3.2.

The covariance is obtained as

$$\text{cov}(D_n, D_m) = E[D_n D_m] - E[D_n]\,E[D_m]$$

which upon substituting the results of equation (D.1.2) yields

$$\text{cov}(D_n, D_m) = 2K(n-1) + K^2(m-1)(n-1) - K^2(m-1)(n-1) = 2K(n-1) \qquad (D.3.1)$$




Similar  expressions  were  obtained  for  the  higher 
central  moments  and  are  summarized  in  Table  15. 

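D.4  DERIVATION OF THE JOINT MOMENT GENERATING FUNCTION OF $T_n$ AND $T_m$

The statistic $T_n$ is given by

$$T_n = n \sum_{i=1}^{K} \left(\bar{X}_{i(n)} - \bar{X}_{(n)}\right)^2 / \sigma^2$$

where

$$\bar{X}_{i(n)} = \frac{1}{n}\sum_{j=1}^{n} X_{ij}, \qquad \bar{X}_{(n)} = \frac{1}{K}\sum_{i=1}^{K} \bar{X}_{i(n)}.$$

The joint moment generating function of the statistics $T_n$ and $T_m$ is best derived by using the following set of transformations:

$$U_1 = \left(\bar{X}_{1(n)} - \bar{X}_{2(n)}\right) / \sqrt{2}\,\sigma$$

$$U_2 = \left(\bar{X}_{1(n)} + \bar{X}_{2(n)} - 2\bar{X}_{3(n)}\right) / \sqrt{6}\,\sigma$$

$$\vdots$$

$$U_{K-1} = \left(\bar{X}_{1(n)} + \bar{X}_{2(n)} + \cdots + \bar{X}_{K-1(n)} - (K-1)\bar{X}_{K(n)}\right) / \sqrt{K(K-1)}\,\sigma$$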


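and

$$V_1 = \left(\bar{X}_{1(m)} - \bar{X}_{2(m)}\right) / \sqrt{2}\,\sigma$$

$$V_2 = \left(\bar{X}_{1(m)} + \bar{X}_{2(m)} - 2\bar{X}_{3(m)}\right) / \sqrt{6}\,\sigma$$

$$\vdots$$

$$V_{K-1} = \left(\bar{X}_{1(m)} + \bar{X}_{2(m)} + \cdots + \bar{X}_{K-1(m)} - (K-1)\bar{X}_{K(m)}\right) / \sqrt{K(K-1)}\,\sigma \qquad (D.4.1)$$

These transformations are those used by Helmert (1876); and are defined so that:

$$\sum_{i=1}^{K-1} U_i^2 = \sum_{i=1}^{K} \left(\bar{X}_{i(n)} - \bar{X}_{(n)}\right)^2 / \sigma^2$$

$$\sum_{i=1}^{K-1} V_i^2 = \sum_{i=1}^{K} \left(\bar{X}_{i(m)} - \bar{X}_{(m)}\right)^2 / \sigma^2 \qquad (D.4.2)$$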
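The identity (D.4.2) can be checked numerically. The sketch below is illustrative: `helmert_contrasts` is a hypothetical helper that builds the K-1 contrasts of (D.4.1) from a list of group means, and the input values are arbitrary.

```python
import math


def helmert_contrasts(means, sigma=1.0):
    """Return the K-1 Helmert contrasts of (D.4.1) for the group means."""
    K = len(means)
    contrasts = []
    for i in range(1, K):
        # (sum of the first i means) minus i times the next mean
        numerator = sum(means[:i]) - i * means[i]
        contrasts.append(numerator / (math.sqrt(i * (i + 1)) * sigma))
    return contrasts


def between_sum_of_squares(means, sigma=1.0):
    """Sum of squared deviations of the group means about their average."""
    grand = sum(means) / len(means)
    return sum((x - grand) ** 2 for x in means) / sigma ** 2


means = [1.0, 2.0, 4.0, 7.0]  # arbitrary illustrative group means
u = helmert_contrasts(means)
print(sum(v * v for v in u), between_sum_of_squares(means))
```

Both printed values equal 21 for this input (up to rounding), so multiplying the sum of squared contrasts by n reproduces $T_n$.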


In addition it can easily be shown that, for $i \neq j$:

$$\text{var}(U_i) = 1/n, \qquad \text{var}(V_j) = 1/m, \qquad \text{cov}(U_i, U_j) = 0, \qquad \text{cov}(V_i, V_j) = 0$$

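and variance covariance matrix $\Sigma$, where $\Sigma$ can be expressed as the following partitioned matrix:

$$\Sigma = \begin{bmatrix} \frac{1}{n} I & \frac{1}{m} I \\ \frac{1}{m} I & \frac{1}{m} I \end{bmatrix} \qquad (D.4.5)$$

I being a (K-1) x (K-1) identity matrix. The density of X is given by

$$f(X) = (2\pi)^{-(K-1)}\, |\Sigma|^{-\frac{1}{2}} \exp\left\{-\tfrac{1}{2}(X-W)'\,\Sigma^{-1}(X-W)\right\}$$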


The joint moment generating function of $T_n$ and $T_m$ may then be obtained by performing the following multidimensional integration:

$$M_{T_n,T_m}(t_1,t_2) = (2\pi)^{-(K-1)}\, |\Sigma|^{-\frac{1}{2}} \int\!\cdots\!\int \exp\{t_1 X'AX\}\, \exp\{t_2 X'BX\}\, \exp\left\{-\tfrac{1}{2}(X-W)'\,\Sigma^{-1}(X-W)\right\} dX_1 \cdots dX_{2K-2} \qquad (D.4.6)$$




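where A and B are matrices which may be expressed in the following block partitioned forms:

$$A_{(2K-2)\times(2K-2)} = \begin{bmatrix} nI & \phi \\ \phi & \phi \end{bmatrix}, \qquad B_{(2K-2)\times(2K-2)} = \begin{bmatrix} \phi & \phi \\ \phi & mI \end{bmatrix} \qquad (D.4.7)$$

where I is again the (K-1) x (K-1) identity matrix and $\phi$ is the (K-1) x (K-1) null matrix.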

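The integration of (D.4.6) is performed in Graybill (1969) (p. 252) and yields

$$M_{T_n,T_m}(t_1,t_2) = |I - 2t_1\Sigma A - 2t_2\Sigma B|^{-\frac{1}{2}} \exp\left\{\tfrac{1}{2}(\Sigma^{-1}W)'\,(\Sigma^{-1} - 2t_1A - 2t_2B)^{-1}(\Sigma^{-1}W) - \tfrac{1}{2}\,W'\Sigma^{-1}W\right\} \qquad (D.4.8)$$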


The determinant portion may be simplified to the following block partitioned form:

$$|I - 2t_1\Sigma A - 2t_2\Sigma B| = \begin{vmatrix} (1-2t_1)\,I & -2t_2\,I \\ -2t_1\left(\frac{n}{m}\right) I & (1-2t_2)\,I \end{vmatrix}$$

which yields (Graybill (1969), p. 165)

$$(1-2t_1)^{K-1} \left| (1-2t_2)\,I - \frac{4t_1t_2\,n}{(1-2t_1)\,m}\, I \right| = \left\{(1-2t_1)(1-2t_2) - 4nt_1t_2m^{-1}\right\}^{(K-1)} \qquad (D.4.9)$$


This  expression  is  similar  to  that  obtained  by  Kibble 
(1945)  in  his  derivation  of  a moment  generating  function 
for  a type  of  bivariate  gamma  distribution. 

The  exponent  may  be  rearranged  as  follows: 


E"1w)'(E‘1-2tlA-2t2B)'1(5:'1w)(-‘'  )£'lw 


- w-L 


^I-2t1Xi  A-2t2^B 


-1 


w 


The  inner  product  can  be  simplified  to  the  following 
block  partitioned  matrix 


£ 1 ( 1-2^2  A-2t2£B ^ -ij  = a 


r 

-I 

riiJ 

r12: 

. r21J 

r22X  . 

R 


where 


rll  = m n ( l~2t2)  -m  n ( l-2t^)(  l-2t^)  + 4mn  t^t2~2mn 
r = 2m2nt2-m2n(l-2t1)+m2n(l-2t1) (l-2t2)-4t1t2mn2 

r ^ = 2m^nt  ^-m2n  ( 1- 2 12 ) +m2n  ( 1- 2t  ^ ) ( 1- 2t2 ) - 4 1 ^t2mn2 

= m3  ( l-2t^) -m2  ( l-2t  ^ ) ( l-2t2 ) + 4t  1t2m2n-2m2nt2 
[m(l-2t^)  ( l~2t0) -4t1t2 

L 

Now, the vector W is symmetrical in that the first K-1 elements are identical to the second K-1 elements. Denote this symmetry by

$$W' = (\,w' \mid w'\,)$$


“ ” Wlrlltr12+r21*r22> 


The quantity $w'w$ is simply $\lambda$, so the above becomes

$$\frac{1}{2}\left\{\frac{2nt_1 + 2mt_2 - 4t_1t_2(m-n)}{(1-2t_1)(1-2t_2) - 4t_1t_2nm^{-1}}\right\}\lambda$$

Thus, the moment generating function is given by the following expression:

$$M_{T_n,T_m}(t_1,t_2) = \left\{(1-2t_1)(1-2t_2) - 4t_1t_2nm^{-1}\right\}^{-\frac{1}{2}(K-1)} \exp\left\{\frac{\lambda}{2}\cdot\frac{2nt_1 + 2mt_2 - 4t_1t_2(m-n)}{(1-2t_1)(1-2t_2) - 4t_1t_2nm^{-1}}\right\} \qquad (D.4.10)$$

This is the moment generating function of a type of noncentral bivariate $\chi^2$ distribution.

D.5  THE MIXED CENTRAL MOMENTS OF $T_n$ AND $T_m$

The mixed central moments of $T_n$ and $T_m$; i.e.

$$E\left[(T_n - \mu_{T_n})^{r}\,(T_m - \mu_{T_m})^{s}\right]$$

are best obtained from the joint cumulant generating function of $T_n$ and $T_m$. The joint cumulant generating function is obtained as the logarithm of the moment generating function given in equation (D.4.10) of the previous section. This yields the following expression:

$$K_{T_n,T_m}(t_1,t_2) = \log\left(M_{T_n,T_m}(t_1,t_2)\right) = -\tfrac{1}{2}(K-1)\log\left\{(1-2t_1)(1-2t_2) - 4t_1t_2nm^{-1}\right\} + \frac{\left[nt_1 + mt_2 - 2t_1t_2(m-n)\right]\lambda}{(1-2t_1)(1-2t_2) - 4t_1t_2nm^{-1}} \qquad (D.5.1)$$

By differentiating or expanding this function one can obtain the joint cumulants. In particular

$$\text{cov}(T_n, T_m) = \left.\frac{\partial^2 K_{T_n,T_m}(t_1,t_2)}{\partial t_1\, \partial t_2}\right|_{t_1 = 0,\; t_2 = 0}$$
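The differentiation in the last step can be mimicked numerically. The following is a minimal sketch, with hypothetical function names, that codes the cumulant generating function of (D.5.1) and approximates its mixed second derivative at the origin by central differences. The result can be compared with the r = 1, s = 1 entry of Table 16, $2(K-1)nm^{-1} + 4n\lambda$.

```python
import math


def cgf_t(t1, t2, K, n, m, lam):
    """Joint cumulant generating function of T_n and T_m, equation (D.5.1)."""
    det = (1.0 - 2.0 * t1) * (1.0 - 2.0 * t2) - 4.0 * t1 * t2 * n / m
    num = n * t1 + m * t2 - 2.0 * t1 * t2 * (m - n)
    return -0.5 * (K - 1) * math.log(det) + lam * num / det


def cov_tn_tm(K, n, m, lam, h=1e-3):
    """Central-difference estimate of the mixed derivative at (0, 0)."""
    def f(a, b):
        return cgf_t(a, b, K, n, m, lam)
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h * h)


# Illustrative parameter values (not taken from the thesis).
K, n, m, lam = 4, 5, 10, 0.3
print(cov_tn_tm(K, n, m, lam), 2 * (K - 1) * n / m + 4 * n * lam)
```

The step size h is an arbitrary choice; the two printed numbers agree to within the $O(h^2)$ truncation error of the difference formula.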
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TABLE  16 


MIXED CENTRAL MOMENTS OF $T_n$ AND $T_m$

$E\left[(T_n - \mu_{T_n})^{r}\,(T_m - \mu_{T_m})^{s}\right]$

 r    s    Moment
 0    2    2(K-1+2mλ)
 0    3    8(K-1+3mλ)
 1    1    2(K-1)nm^{-1} + 4nλ
 1    2    24nλ + 4(K-1)(3nm^{-1} - 1)
 2    0    2(K-1+2nλ)
 2    1    8n(K-1)m^{-1} + 24nλ - 8(m-n)λm^{-1}
 3    0    8(K-1+3nλ)
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